xsum_55555_3000_1500_train
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_55555_3000_1500_train")
topic_model.get_topic_info()
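Beyond `get_topic_info()`, a loaded model can also assign topics to unseen text. The following is a minimal sketch, assuming the model loads from the Hub as shown above and that its embedding model can be resolved at load time. The `assign_topics` helper is hypothetical and not part of BERTopic or of this model card.

```python
from typing import List, Tuple


def assign_topics(
    docs: List[str],
    model_id: str = "KingKazma/xsum_55555_3000_1500_train",
) -> List[Tuple[int, list]]:
    """Return (topic_id, keywords) for each document.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(model_id)
    # transform() returns the predicted topic for each document and,
    # when available, the topic probabilities.
    topics, _probs = topic_model.transform(docs)

    results = []
    for topic_id in topics:
        # get_topic() returns a list of (word, score) pairs for a topic.
        keywords = topic_model.get_topic(int(topic_id))
        results.append((int(topic_id), keywords))
    return results
```

Calling `assign_topics(["some news text"])` downloads the model on first use, so it needs network access. A topic ID of -1 means the document was treated as an outlier.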
Topic overview
- Number of topics: 54
- Number of training documents: 3000
The table below gives an overview of all topics. Topic -1 collects the outlier documents that BERTopic did not assign to any topic.
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | said - people - would - mr - year | 6 | -1_said_people_would_mr |
0 | party - eu - labour - vote - brexit | 1465 | 0_party_eu_labour_vote |
1 | trump - mr - president - republican - russia | 129 | 1_trump_mr_president_republican |
2 | care - health - nhs - patient - hospital | 76 | 2_care_health_nhs_patient |
3 | syria - syrian - attack - killed - force | 75 | 3_syria_syrian_attack_killed |
4 | cricket - wicket - england - test - ball | 64 | 4_cricket_wicket_england_test |
5 | club - league - season - appearance - loan | 59 | 5_club_league_season_appearance |
6 | wales - rugby - england - game - player | 58 | 6_wales_rugby_england_game |
7 | film - show - actor - actress - star | 55 | 7_film_show_actor_actress |
8 | medal - sport - olympic - gold - world | 54 | 8_medal_sport_olympic_gold |
9 | driving - driver - crash - car - road | 48 | 9_driving_driver_crash_car |
10 | chelsea - arsenal - city - goal - tottenham | 44 | 10_chelsea_arsenal_city_goal |
11 | president - mr - petrobras - odebrecht - government | 43 | 11_president_mr_petrobras_odebrecht |
12 | lifeboat - sea - rnli - ship - boat | 41 | 12_lifeboat_sea_rnli_ship |
13 | crime - police - child - force - abuse | 37 | 13_crime_police_child_force |
14 | man - police - men - wearing - arrested | 35 | 14_man_police_men_wearing |
15 | murray - seed - match - slam - set | 34 | 15_murray_seed_match_slam |
16 | dog - mountain - animal - avalanche - said | 34 | 16_dog_mountain_animal_avalanche |
17 | court - sexual - assault - trial - woman | 31 | 17_court_sexual_assault_trial |
18 | school - education - teacher - academy - pupil | 30 | 18_school_education_teacher_academy |
19 | fifa - ghana - burkina - african - cup | 29 | 19_fifa_ghana_burkina_african |
20 | music - album - song - like - im | 28 | 20_music_album_song_like |
21 | fire - blaze - rescue - said - building | 28 | 21_fire_blaze_rescue_said |
22 | energy - gas - shale - project - power | 27 | 22_energy_gas_shale_project |
23 | train - rail - bridge - scotrail - strike | 27 | 23_train_rail_bridge_scotrail |
24 | growth - rate - oil - market - us | 26 | 24_growth_rate_oil_market |
25 | town - foul - box - footed - half | 26 | 25_town_foul_box_footed |
26 | open - round - golf - par - birdie | 26 | 26_open_round_golf_par |
27 | china - north - chinese - xi - taiwan | 22 | 27_china_north_chinese_xi |
28 | bond - bank - greek - greece - eurozone | 22 | 28_bond_bank_greek_greece |
29 | race - lap - second - honda - driver | 21 | 29_race_lap_second_honda |
30 | president - mr - congolese - africa - african | 21 | 30_president_mr_congolese_africa |
31 | barcelona - fc - madrid - de - bayern | 19 | 31_barcelona_fc_madrid_de |
32 | murder - man - postmortem - court - found | 18 | 32_murder_man_postmortem_court |
33 | welsh - wales - government - assembly - labour | 17 | 33_welsh_wales_government_assembly |
34 | celtic - game - season - rangers - team | 17 | 34_celtic_game_season_rangers |
35 | heritage - castle - house - orkney - building | 17 | 35_heritage_castle_house_orkney |
36 | tax - deficit - debt - economy - financial | 16 | 36_tax_deficit_debt_economy |
37 | stream - jet - weather - wind - flood | 15 | 37_stream_jet_weather_wind |
38 | software - security - data - hacker - router | 15 | 38_software_security_data_hacker |
39 | painting - portrait - art - collection - artist | 14 | 39_painting_portrait_art_collection |
40 | apple - tablet - hp - firm - android | 14 | 40_apple_tablet_hp_firm |
41 | robertson - mr - court - knife - murder | 12 | 41_robertson_mr_court_knife |
42 | unsupported - device - updated - playback - media | 12 | 42_unsupported_device_updated_playback |
43 | iaaf - doping - athlete - athletics - antidoping | 11 | 43_iaaf_doping_athlete_athletics |
44 | stolen - theft - burglary - thief - store | 11 | 44_stolen_theft_burglary_thief |
45 | yn - ar - mae - bod - ei | 11 | 45_yn_ar_mae_bod |
46 | flight - plane - airport - aircraft - passenger | 11 | 46_flight_plane_airport_aircraft |
47 | baby - child - infant - mcelhinney - church | 10 | 47_baby_child_infant_mcelhinney |
48 | party - fillon - mr - socialist - macron | 10 | 48_party_fillon_mr_socialist |
49 | serbia - scotland - celtic - throwin - kick | 9 | 49_serbia_scotland_celtic_throwin |
50 | child - childcare - families - mental - nurse | 8 | 50_child_childcare_families_mental |
51 | turkey - migrant - eu - visa - greece | 6 | 51_turkey_migrant_eu_visa |
52 | supermarket - store - price - sale - tyrrells | 6 | 52_supermarket_store_price_sale |
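The frequencies in the topic table above can be read as shares of the 3000 training documents. The sketch below works this out for the first few rows only. The `share` helper and the dictionary name are illustrative, and the counts are copied directly from the table.

```python
# Frequencies copied from the topic table (topic ID -> document count).
# Only the first few rows are listed to keep the example short.
TOTAL_DOCS = 3000  # "Number of training documents" from the overview

topic_frequency = {
    -1: 6,    # outlier documents
    0: 1465,
    1: 129,
    2: 76,
    3: 75,
}


def share(topic_id: int) -> float:
    """Fraction of the training documents assigned to `topic_id`."""
    return topic_frequency[topic_id] / TOTAL_DOCS


for topic_id in topic_frequency:
    print(f"topic {topic_id:>2}: {share(topic_id):.1%}")
```

Topic 0 alone covers 1465 of the 3000 documents, a little under half of the training set, while only 6 documents fall into the outlier topic.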
Training hyperparameters
- calculate_probabilities: True
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: None
- seed_topic_list: None
- top_n_words: 10
- verbose: False
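As a rough illustration, the settings above map onto keyword arguments of the `BERTopic` constructor. This is only a sketch: the card does not state the embedding model, the UMAP and HDBSCAN settings, or the preprocessing used, so refitting with these values is not guaranteed to reproduce the topics listed earlier. The `build_untrained_model` helper is hypothetical.

```python
# Values copied from the "Training hyperparameters" list.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_untrained_model():
    """Create a fresh, unfitted BERTopic instance with the listed settings.

    Hypothetical helper; the embedding model and sub-models fall back to
    BERTopic's defaults because the card does not specify them.
    """
    # Imported lazily so this module can be imported without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)
```

The returned model still has to be fitted, for example with `fit_transform(docs)` on a list of documents.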
Framework versions
- Numpy: 1.22.4
- HDBSCAN: 0.8.33
- UMAP: 0.5.3
- Pandas: 1.5.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.2.2
- Transformers: 4.31.0
- Numba: 0.57.1
- Plotly: 5.13.1
- Python: 3.10.12
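To compare a local environment against the versions listed above, a small check along the following lines may help. It is a sketch that uses only the standard library. The mapping from the names in the list to PyPI distribution names (for example `umap-learn` for UMAP) is an assumption, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list, keyed by the
# distribution names assumed to be used on PyPI.
EXPECTED_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: expected {wanted}, found {installed}")
```

A mismatch does not necessarily prevent the model from loading; the list only records the environment the model was trained in.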