sb_clustering_topics
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("Thabet/sb_clustering_topics")
topic_model.get_topic_info()
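Beyond inspecting the topic table, a loaded model can also assign topics to new documents. The sketch below assumes the standard BERTopic `transform` method and `topic_labels_` attribute. `label_documents` is a hypothetical helper name, not part of BERTopic, and the example sentence in the usage comment is made up for illustration.

```python
def label_documents(topic_model, docs):
    """Return (document, topic ID, topic label) for each input document.

    `topic_model` is expected to be a fitted BERTopic instance.
    """
    # transform() returns one topic ID per document, plus probabilities.
    topics, _probabilities = topic_model.transform(docs)
    labels = topic_model.topic_labels_  # maps topic ID -> label string
    return [
        (doc, int(topic_id), labels.get(int(topic_id), "unknown"))
        for doc, topic_id in zip(docs, topics)
    ]


# Usage (requires `bertopic` and network access to download the model):
# from bertopic import BERTopic
# topic_model = BERTopic.load("Thabet/sb_clustering_topics")
# for doc, topic_id, label in label_documents(topic_model, ["finale de basket"]):
#     print(topic_id, label, doc)
```

A topic ID of -1 means the document was treated as an outlier rather than assigned to one of the numbered topics.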
Topic overview
- Number of topics: 40 (including the outlier topic -1)
- Number of training documents: 1636
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | jt - actu - fte - ville - 57 | 11 | -1_jt_actu_fte_ville |
0 | vef - invit - invite - portrait - diff | 613 | 0_vef_invit_invite_portrait |
1 | foot - football - ligue - fc - match | 62 | 1_foot_football_ligue_fc |
2 | festival - jazz - dition - pommiers - jazz pommiers | 59 | 2_festival_jazz_dition_pommiers |
3 | renvoi - college - collge - lyce - lcole | 59 | 3_renvoi_college_collge_lyce |
4 | tribunal - proces - procs - affaire - permis conduire | 56 | 4_tribunal_proces_procs_affaire |
5 | tourisme - weekend - ascension - lascension - weekend ascension | 48 | 5_tourisme_weekend_ascension_lascension |
6 | urgences - non - soignants non - soignants - non vaccins | 48 | 6_urgences_non_soignants non_soignants |
7 | muse - expo - chteau - chateau - monument | 44 | 7_muse_expo_chteau_chateau |
8 | eau - deau - leau - eaux - qualite | 42 | 8_eau_deau_leau_eaux |
9 | culture - teaser chronique - chronique - teaser - jour | 39 | 9_culture_teaser chronique_chronique_teaser |
10 | homophobie - contre - lgbt - contre lhomophobie - lhomophobie | 36 | 10_homophobie_contre_lgbt_contre lhomophobie |
11 | basket - d69 basket - asvel - fminin villeneuve - finale | 33 | 11_basket_d69 basket_asvel_fminin villeneuve |
12 | rugby - mont marsan - marsan - dublin - finale dublin | 31 | 12_rugby_mont marsan_marsan_dublin |
13 | roues folie - roues - srie roues - folie - srie | 29 | 13_roues folie_roues_srie roues_folie |
14 | grve - sncf - brve - sncf dijon - breve | 27 | 14_grve_sncf_brve_sncf dijon |
15 | rue - rue pierre - parking - mauroy - pierre mauroy | 26 | 15_rue_rue pierre_parking_mauroy |
16 | ouvrier - ouvrier france - serie - france - meilleur ouvrier | 23 | 16_ouvrier_ouvrier france_serie_france |
17 | feux - agricoles - vols - agriculteurs - vols gps | 23 | 17_feux_agricoles_vols_agriculteurs |
18 | vertbaudet - centre - commerants - commerce - centreville | 22 | 18_vertbaudet_centre_commerants_commerce |
19 | archives - policier - policiers - congrs ps - politique | 22 | 19_archives_policier_policiers_congrs ps |
20 | recyclage - made in - made - transforme - carton | 22 | 20_recyclage_made in_made_transforme |
21 | cvdl - trail - routes - route - cvdl invite | 21 | 21_cvdl_trail_routes_route |
22 | sniors - ans - dune - maison - secondaires | 20 | 22_sniors_ans_dune_maison |
23 | sports - sport - loc sport - loc - aim | 20 | 23_sports_sport_loc sport_loc |
24 | cannes - festival cannes - festival - cannes festival - d06 | 18 | 24_cannes_festival cannes_festival_cannes festival |
25 | maires - maire - dmission - maire veyrac - dep dmission | 17 | 25_maires_maire_dmission_maire veyrac |
26 | solidaire - bo - bouquinerie solidaire - rdvcv - rdvcv bo | 16 | 26_solidaire_bo_bouquinerie solidaire_rdvcv |
27 | armada - vins - mer - larmada - maritime | 15 | 27_armada_vins_mer_larmada |
28 | accident - accident mortel - mortel - fayssal - mortel minibus | 15 | 28_accident_accident mortel_mortel_fayssal |
29 | dunkerque - jours dunkerque - jours - tape - dunkerque tape | 14 | 29_dunkerque_jours dunkerque_jours_tape |
30 | armes anciennes - participants - twirling bton - twirling - convention | 13 | 30_armes anciennes_participants_twirling bton_twirling |
31 | cpop - carmina - eyes - planete - savaoo application | 12 | 31_cpop_carmina_eyes_planete |
32 | secheresse - scurit - levage - limplantation - poules salmagne | 12 | 32_secheresse_scurit_levage_limplantation |
33 | bio - aides - d86 - hugues bioret - dossier presse | 12 | 33_bio_aides_d86_hugues bioret |
34 | collectif - camping - collecte - infimiers libraux - sr d51 | 12 | 34_collectif_camping_collecte_infimiers libraux |
35 | grand prix - prix - grand - pau - race | 11 | 35_grand prix_prix_grand_pau |
36 | boom - technique - entreprise - emploi open - futurs | 11 | 36_boom_technique_entreprise_emploi open |
37 | championnat - escrime - 57 - championnat escrime - 57 championnat | 11 | 37_championnat_escrime_57_championnat escrime |
38 | oiseaux - population - oiseaux lpo - lpo - gl69 oorion | 11 | 38_oiseaux_population_oiseaux lpo_lpo |
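Each label in the table is the topic ID followed by the topic's first four keywords, joined with underscores. A minimal sketch of splitting a label back into its parts, assuming keywords may contain spaces but never underscores (as is the case for every row above); `parse_label` is a hypothetical helper, not a BERTopic function.

```python
def parse_label(label):
    """Split a label such as '1_foot_football_ligue_fc' into (1, [keywords])."""
    # The topic ID is everything before the first underscore; this also
    # handles the outlier topic, whose ID is the string "-1".
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")


topic_id, keywords = parse_label("6_urgences_non_soignants non_soignants")
print(topic_id, keywords)
```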
Training hyperparameters
- calculate_probabilities: True
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: None
- seed_topic_list: None
- top_n_words: 10
- verbose: False
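The values above correspond to arguments of the `BERTopic` constructor. The sketch below shows how a model with the same settings could be instantiated. It is not a reproduction of the original training setup: the embedding model, the UMAP, HDBSCAN and vectorizer settings, and the training documents are not listed in this card. `build_topic_model` is a hypothetical helper name.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_topic_model():
    """Create an unfitted BERTopic instance with the listed hyperparameters."""
    # Imported here so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)


# Usage (assumes `docs` is your own list of strings):
# topic_model = build_topic_model()
# topics, probabilities = topic_model.fit_transform(docs)
```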
Framework versions
- Numpy: 1.23.5
- HDBSCAN: 0.8.33
- UMAP: 0.5.3
- Pandas: 1.5.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.2.2
- Transformers: 4.31.0
- Numba: 0.56.4
- Plotly: 5.15.0
- Python: 3.10.12
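Loading a saved model in an environment with different library versions can sometimes cause problems, so it may help to compare your installation against the versions listed above. A minimal sketch using only the standard library; the PyPI distribution names (for example `umap-learn` for UMAP) are assumptions about how the packages were installed, and `find_version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.56.4",
    "plotly": "5.15.0",
}


def find_version_mismatches(expected=EXPECTED_VERSIONS, get_version=version):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


# Usage:
# for package, (wanted, installed) in find_version_mismatches().items():
#     print(f"{package}: expected {wanted}, found {installed}")
```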