Add BERTopic model
- README.md +204 -0
- config.json +17 -0
- topic_embeddings.safetensors +3 -0
- topics.json +0 -0
README.md
ADDED
@@ -0,0 +1,204 @@
---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# bertopic-20-newsgroups

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("ctam8736/bertopic-20-newsgroups")

topic_model.get_topic_info()
```
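
Beyond `get_topic_info()`, individual topics can be inspected by ID. The sketch below assumes that `get_topic(topic_id)` returns a list of `(word, score)` pairs and a falsy value for an unknown ID, as recent BERTopic releases do. `top_words` is a hypothetical helper, not part of BERTopic, and it is written against any object exposing that method so it can be tried without downloading the model.

```python
from typing import Any, List


def top_words(topic_model: Any, topic_id: int, n: int = 5) -> List[str]:
    """Return up to n leading keywords of a topic (hypothetical helper)."""
    # Assumption: get_topic yields [(word, score), ...] ordered by score,
    # or a falsy value when the topic ID does not exist.
    pairs = topic_model.get_topic(topic_id)
    if not pairs:
        return []
    return [word for word, _score in pairs[:n]]
```

With the model loaded as above, `top_words(topic_model, 0)` would list the leading keywords of topic 0.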

## Topic overview

* Number of topics: 135
* Number of training documents: 11314

<details>
<summary>Click here for an overview of all topics.</summary>

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | article - information - subject - re - what | 10 | -1_article_information_subject_re |
| 0 | scsi - scsi2 - scsi1 - drives - bios | 3737 | 0_scsi_scsi2_scsi1_drives |
| 1 | nhl - puck - leafs - flyers - pitching | 976 | 1_nhl_puck_leafs_flyers |
| 2 | firearm - firearms - handgun - guns - gun | 918 | 2_firearm_firearms_handgun_guns |
| 3 | ford - honda - nissan - bmw - dealer | 409 | 3_ford_honda_nissan_bmw |
| 4 | encryption - encrypted - crypto - nsa - chip | 387 | 4_encryption_encrypted_crypto_nsa |
| 5 | atheism - atheist - atheists - christianity - belief | 377 | 5_atheism_atheist_atheists_christianity |
| 6 | hezbollah - gaza - lebanon - palestinians - lebanese | 342 | 6_hezbollah_gaza_lebanon_palestinians |
| 7 | window - x11r5 - openwindows - x11 - x11r4 | 249 | 7_window_x11r5_openwindows_x11 |
| 8 | modems - modem - mouse - ports - port | 243 | 8_modems_modem_mouse_ports |
| 9 | anonymity - anonymous - mailing - usenet - newsgroups | 151 | 9_anonymity_anonymous_mailing_usenet |
| 10 | armenians - armenia - armenian - azerbaijani - azerbaijan | 147 | 10_armenians_armenia_armenian_azerbaijani |
| 11 | clinton - stephanopoulos - secretary - president - congress | 135 | 11_clinton_stephanopoulos_secretary_president |
| 12 | os - windows - win32 - microsoft - win31 | 133 | 12_os_windows_win32_microsoft |
| 13 | diseases - disease - candida - infection - infections | 113 | 13_diseases_disease_candida_infection |
| 14 | superstition - msg - sensitivity - glutamate - causes | 100 | 14_superstition_msg_sensitivity_glutamate |
| 15 | laserjet - inkjet - printers - bubblejet - bubblejets | 86 | 15_laserjet_inkjet_printers_bubblejet |
| 16 | billboard - billboards - nasa - space - advertising | 75 | 16_billboard_billboards_nasa_space |
| 17 | radar - detectors - detector - detecting - radarjust | 68 | 17_radar_detectors_detector_detecting |
| 18 | speeding - speeds - mph - speed - driving | 64 | 18_speeding_speeds_mph_speed |
| 19 | ssto - moonbase - moon - lunar - billion | 63 | 19_ssto_moonbase_moon_lunar |
| 20 | station - nasa - redesign - space - shuttle | 61 | 20_station_nasa_redesign_space |
| 21 | eternity - afterlife - heaven - hell - judgement | 49 | 21_eternity_afterlife_heaven_hell |
| 22 | testament - manuscripts - scripture - bible - hebrew | 47 | 22_testament_manuscripts_scripture_bible |
| 23 | homosexuality - heterosexual - homosexual - homosexuals - gays | 45 | 23_homosexuality_heterosexual_homosexual_homosexuals |
| 24 | libertarians - libertarian - libertarianism - regulation - governments | 44 | 24_libertarians_libertarian_libertarianism_regulation |
| 25 | islamic - muslim - islam - muslims - koran | 44 | 25_islamic_muslim_islam_muslims |
| 26 | tax - taxes - vat - deficits - income | 44 | 26_tax_taxes_vat_deficits |
| 27 | oil - drain - engine - fuel - dumping | 44 | 27_oil_drain_engine_fuel |
| 28 | helmet - helmets - head - protection - gloves | 43 | 28_helmet_helmets_head_protection |
| 29 | fonts - font - ttfonts - truetype - printing | 42 | 29_fonts_font_ttfonts_truetype |
| 30 | morality - moral - morals - instinctive - immoral | 39 | 30_morality_moral_morals_instinctive |
| 31 | colormaps - colourmap - colormap - xalloccolor - cwcolormap | 39 | 31_colormaps_colourmap_colormap_xalloccolor |
| 32 | homosexuals - molesters - homosexual - homosexuality - pedophilia | 38 | 32_homosexuals_molesters_homosexual_homosexuality |
| 33 | migraine - migraines - headache - headaches - analgesics | 37 | 33_migraine_migraines_headache_headaches |
| 34 | resurrection - gospels - tomb - testament - jesuss | 37 | 34_resurrection_gospels_tomb_testament |
| 35 | graphics - copyright - images - siggraph - image | 37 | 35_graphics_copyright_images_siggraph |
| 36 | mormon - mormons - lds - brigham - utah | 35 | 36_mormon_mormons_lds_brigham |
| 37 | scientific - scipsychology - scientist - science - methodology | 34 | 37_scientific_scipsychology_scientist_science |
| 38 | tapes - tape - backup - copy - floppy | 34 | 38_tapes_tape_backup_copy |
| 39 | drugs - marijuana - drug - legalizing - legalization | 34 | 39_drugs_marijuana_drug_legalizing |
| 40 | punishment - punish - murder - penalty - murderer | 34 | 40_punishment_punish_murder_penalty |
| 41 | sphere - globe - radius - pointstruct - circle | 34 | 41_sphere_globe_radius_pointstruct |
| 42 | surgery - patients - hernia - massager - pain | 33 | 42_surgery_patients_hernia_massager |
| 43 | genocide - bosnia - atheism - serbs - christians | 32 | 43_genocide_bosnia_atheism_serbs |
| 44 | insurance - liability - insureyear - deductible - accident | 32 | 44_insurance_liability_insureyear_deductible |
| 45 | polygon - polygons - triangulation - hexagons - polyn | 30 | 45_polygon_polygons_triangulation_hexagons |
| 46 | spacecraft - galileo - galileos - mission - magellan | 29 | 46_spacecraft_galileo_galileos_mission |
| 47 | countersteering - countersteeringfaq - countersteer - riding - bikes | 29 | 47_countersteering_countersteeringfaq_countersteer_riding |
| 48 | antenna - antennas - transmitters - transmitting - radios | 28 | 48_antenna_antennas_transmitters_transmitting |
| 49 | canine - dogs - dog - spaniel - springer | 28 | 49_canine_dogs_dog_spaniel |
| 50 | batteries - battery - electrolyte - galvanized - zinc | 28 | 50_batteries_battery_electrolyte_galvanized |
| 51 | oscilloscope - scopes - scope - oscilliscopes - digital | 27 | 51_oscilloscope_scopes_scope_oscilliscopes |
| 52 | xgrabkey - definekeys - accelerators - accelerator - shiftkeyq | 27 | 52_xgrabkey_definekeys_accelerators_accelerator |
| 53 | protoncentaur - centaur - proton - accelerator - nuclear | 27 | 53_protoncentaur_centaur_proton_accelerator |
| 54 | telephone - dial - phone - call - lines | 26 | 54_telephone_dial_phone_call |
| 55 | marriages - wedding - vows - weddings - marriage | 25 | 55_marriages_wedding_vows_weddings |
| 56 | ibm - levels - level - nasa - software | 25 | 56_ibm_levels_level_nasa |
| 57 | nasa - aerospace - astronomy - spacecraft - astronomical | 24 | 57_nasa_aerospace_astronomy_spacecraft |
| 58 | motif - neosoft - unix - platforms - software | 24 | 58_motif_neosoft_unix_platforms |
| 59 | nuclear - cooling - reactor - tower - towers | 23 | 59_nuclear_cooling_reactor_tower |
| 60 | injuries - struck - snot - rocks - warningplease | 23 | 60_injuries_struck_snot_rocks |
| 61 | transmissions - shifter - automatics - autos - auto | 22 | 61_transmissions_shifter_automatics_autos |
| 62 | lzr1260 - printing - mwt9caxaxaxaxaxaxaxaxaxaxaxaxax - m9l0qaxaxaxaxaxaxaxaxaxaxaxaxaxax - mi68qaxaxaxaxaxaxaxaxaxaxaxaxaxax | 22 | 62_lzr1260_printing_mwt9caxaxaxaxaxaxaxaxaxaxaxaxax_m9l0qaxaxaxaxaxaxaxaxaxaxaxaxaxax |
| 63 | cview - files - directory - file - tmp | 21 | 63_cview_files_directory_file |
| 64 | immaculate - mary - marys - conception - catholics | 21 | 64_immaculate_mary_marys_conception |
| 65 | cryptology - cryptanalyst - crypt - cryptanalysis - ciphers | 20 | 65_cryptology_cryptanalyst_crypt_cryptanalysis |
| 66 | hotelco - hotels - resorts - hotel - tickets | 20 | 66_hotelco_hotels_resorts_hotel |
| 67 | 3dos - 3do - 3ds - 3d - 3dstudio | 20 | 67_3dos_3do_3ds_3d |
| 68 | comet - comets - jupiter - asteroids - jovian | 20 | 68_comet_comets_jupiter_asteroids |
| 69 | polishing - scratches - paint - rubbing - glaze | 20 | 69_polishing_scratches_paint_rubbing |
| 70 | newsgroup - groups - groupsplit - group - split | 20 | 70_newsgroup_groups_groupsplit_group |
| 71 | koresh - koreshs - david - sermon - biblical | 20 | 71_koresh_koreshs_david_sermon |
| 72 | parking - parked - liability - unsafe - stickers | 20 | 72_parking_parked_liability_unsafe |
| 73 | trumpet - tcp - windows - winqvtnet - winsock | 19 | 73_trumpet_tcp_windows_winqvtnet |
| 74 | freon - heater - coolant - r12 - vents | 19 | 74_freon_heater_coolant_r12 |
| 75 | sabbath - commandments - sunday - worship - church | 19 | 75_sabbath_commandments_sunday_worship |
| 76 | geekdom - computer - fourdcom - csws18icsunysbedu - psychnet | 19 | 76_geekdom_computer_fourdcom_csws18icsunysbedu |
| 77 | bosnia - serbs - sanctions - somalia - war | 18 | 77_bosnia_serbs_sanctions_somalia |
| 78 | soundblaster - midi - midimapper - soundexe - wavfiles | 18 | 78_soundblaster_midi_midimapper_soundexe |
| 79 | condo - remodeled - townhome - bedroom - rent | 18 | 79_condo_remodeled_townhome_bedroom |
| 80 | odometers - odometer - sensor - mileage - sensors | 18 | 80_odometers_odometer_sensor_mileage |
| 81 | joystick - joysticks - joyport - joyread - hardware | 17 | 81_joystick_joysticks_joyport_joyread |
| 82 | abortion - abortions - roe - proabortion - fetus | 17 | 82_abortion_abortions_roe_proabortion |
| 83 | seizures - seizure - allergies - corn - cereal | 17 | 83_seizures_seizure_allergies_corn |
| 84 | sobriety - sober - drinking - drink - drinks | 17 | 84_sobriety_sober_drinking_drink |
| 85 | nubus - lciiipowerpc - pds - powerpcs - powerpc | 17 | 85_nubus_lciiipowerpc_pds_powerpcs |
| 86 | mining - miners - minerals - miner - mineral | 17 | 86_mining_miners_minerals_miner |
| 87 | outlets - outlet - electrical - wiring - grounded | 16 | 87_outlets_outlet_electrical_wiring |
| 88 | rosicrucianum - rosicrucian - orders - order - organization | 16 | 88_rosicrucianum_rosicrucian_orders_order |
| 89 | tempest - shielding - surveillance - encryption - electromagnetic | 16 | 89_tempest_shielding_surveillance_encryption |
| 90 | monitor - monitors - screen - scrolling - display | 16 | 90_monitor_monitors_screen_scrolling |
| 91 | krillean - photographs - photography - kirlian - pictures | 16 | 91_krillean_photographs_photography_kirlian |
| 92 | scanner - scanners - scanning - scans - scanman | 16 | 92_scanner_scanners_scanning_scans |
| 93 | sexism - sexist - extramarital - islamic - marriage | 16 | 93_sexism_sexist_extramarital_islamic |
| 94 | noisy - noise - noises - rattled - quiets | 16 | 94_noisy_noise_noises_rattled |
| 95 | orion - astronomy - museum - prototype - space | 15 | 95_orion_astronomy_museum_prototype |
| 96 | easter - pagan - celebrating - feast - celebration | 15 | 96_easter_pagan_celebrating_feast |
| 97 | batf - assault - waco - blasting - blast | 15 | 97_batf_assault_waco_blasting |
| 98 | batchfile - ini - updating - file - winfileini | 15 | 98_batchfile_ini_updating_file |
| 99 | copyprotect - copying - protected - copy - protection | 15 | 99_copyprotect_copying_protected_copy |
| 100 | 42 - tiff - tiff6 - significance - universe | 14 | 100_42_tiff_tiff6_significance |
| 101 | stove - stoves - splitfires - splitfire - burns | 14 | 101_stove_stoves_splitfires_splitfire |
| 102 | automotive - backing - lights - corvette - reverse | 14 | 102_automotive_backing_lights_corvette |
| 103 | dock - docks - minidocks - portable - minidock | 14 | 103_dock_docks_minidocks_portable |
| 104 | cdaudio - stereo - audio - soundbase - speakers | 14 | 104_cdaudio_stereo_audio_soundbase |
| 105 | uv - flashlight - houselights - fluorescent - lamps | 14 | 105_uv_flashlight_houselights_fluorescent |
| 106 | papal - papacy - pope - popes - schism | 14 | 106_papal_papacy_pope_popes |
| 107 | scsi - quadra - quadras - quadraspecific - firmware | 14 | 107_scsi_quadra_quadras_quadraspecific |
| 108 | crohns - colitis - dietary - gastroenterology - diet | 13 | 108_crohns_colitis_dietary_gastroenterology |
| 109 | crashes - powerbook - plugged - corrupted - duos | 13 | 109_crashes_powerbook_plugged_corrupted |
| 110 | eyedness - handedness - righteye - righthandedness - eyes | 13 | 110_eyedness_handedness_righteye_righthandedness |
| 111 | wrench - pliers - tool - tools - srb | 13 | 111_wrench_pliers_tool_tools |
| 112 | scripture - scriptures - prophecy - revelation - revelations | 13 | 112_scripture_scriptures_prophecy_revelation |
| 113 | nikon - lens - lenses - olympus - 35mm | 13 | 113_nikon_lens_lenses_olympus |
| 114 | prosecution - suspects - encrypted - defendant - incriminate | 13 | 114_prosecution_suspects_encrypted_defendant |
| 115 | wheel - shaftdrives - wheelies - wheelie - shaftdrive | 12 | 115_wheel_shaftdrives_wheelies_wheelie |
| 116 | obesity - rebound - dieting - diet - metabolism | 12 | 116_obesity_rebound_dieting_diet |
| 117 | adl - adls - spying - fbi - investigation | 12 | 117_adl_adls_spying_fbi |
| 118 | lunar - moon - exploration - attend - conference | 12 | 118_lunar_moon_exploration_attend |
| 119 | draftees - draft - selective - military - abolished | 12 | 119_draftees_draft_selective_military |
| 120 | sunrise - sunset - daylight - algorithm - astronomical | 12 | 120_sunrise_sunset_daylight_algorithm |
| 121 | octopus - octopuses - octopi - squid - octapus | 12 | 121_octopus_octopuses_octopi_squid |
| 122 | gassing - explosion - gas - explode - explosive | 11 | 122_gassing_explosion_gas_explode |
| 123 | tutorial - handbook - chemistry - paperback - books | 11 | 123_tutorial_handbook_chemistry_paperback |
| 124 | amp - decibels - current - ampere - db | 11 | 124_amp_decibels_current_ampere |
| 125 | uniforms - jerseys - uniform - mets - reds | 11 | 125_uniforms_jerseys_uniform_mets |
| 126 | eugenics - eugenic - geneticallyengineered - genetic - genetically | 11 | 126_eugenics_eugenic_geneticallyengineered_genetic |
| 127 | fractals - fractal - fractally - compression - pascalfractals | 11 | 127_fractals_fractal_fractally_compression |
| 128 | sunview - xputimage - pixmap - pixmaps - ximage | 11 | 128_sunview_xputimage_pixmap_pixmaps |
| 129 | waving - wave - waves - bikers - bikes | 11 | 129_waving_wave_waves_bikers |
| 130 | vocoder - compressionalgorithms - compression - modems - cryptophones | 11 | 130_vocoder_compressionalgorithms_compression_modems |
| 131 | mouse - jumpiness - mousecom - mouseits - jumps | 11 | 131_mouse_jumpiness_mousecom_mouseits |
| 132 | netware - lan - workgroup - workgroups - w4wg | 10 | 132_netware_lan_workgroup_workgroups |
| 133 | timers - timer - ultralong - clock - oscillator | 10 | 133_timers_timer_ultralong_clock |

</details>
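
The Label column in the table above appears to follow a simple pattern: the topic ID followed by the first four keywords, joined with underscores. The sketch below reproduces that pattern from a row's Topic Keywords cell. `make_label` is a hypothetical helper written for illustration, not BERTopic's own labelling code, and the `" - "` separator is taken from how the table is rendered.

```python
def make_label(topic_id: int, keywords: str, n_words: int = 4) -> str:
    """Rebuild a table label from a topic ID and its keyword cell."""
    # The keyword cell lists words separated by " - ".
    words = [word.strip() for word in keywords.split(" - ")]
    # Label = ID plus the first n_words keywords, underscore-joined.
    return "_".join([str(topic_id)] + words[:n_words])


print(make_label(0, "scsi - scsi2 - scsi1 - drives - bios"))
# prints 0_scsi_scsi2_scsi1_drives, matching the Label cell of topic 0
```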

## Training hyperparameters

* calculate_probabilities: False
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: auto
* seed_topic_list: None
* top_n_words: 10
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

## Framework versions

* Numpy: 1.23.5
* HDBSCAN: 0.8.33
* UMAP: 0.5.5
* Pandas: 2.2.1
* Scikit-Learn: 1.3.1
* Sentence-transformers: 2.5.1
* Transformers: 4.37.0.dev0
* Numba: 0.59.1
* Plotly: 5.20.0
* Python: 3.10.4
config.json
ADDED
@@ -0,0 +1,17 @@
{
  "calculate_probabilities": false,
  "language": "english",
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [
    1,
    1
  ],
  "nr_topics": "auto",
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
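
The keys in `config.json` line up with the hyperparameters listed in the README, plus an `embedding_model` entry. As a rough illustration of how such a file could be read back, the sketch below separates the embedding model name from the remaining parameters and restores `n_gram_range` to a tuple, since JSON has no tuple type. `split_config` is a hypothetical helper; whether BERTopic performs the same conversion internally is not shown in this diff.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_TEXT = """
{
  "calculate_probabilities": false,
  "language": "english",
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": "auto",
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
"""


def split_config(text: str):
    """Return (hyperparameters, embedding model name) from config JSON."""
    config = json.loads(text)
    # The embedding model is a name, not a numeric hyperparameter.
    embedding_model = config.pop("embedding_model", None)
    # JSON stores the n-gram range as a list; the README shows a tuple.
    if "n_gram_range" in config:
        config["n_gram_range"] = tuple(config["n_gram_range"])
    return config, embedding_model


params, embedding_model = split_config(CONFIG_TEXT)
print(params["n_gram_range"], embedding_model)
```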
topic_embeddings.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ef0a7c8022a1ea4efebc922cfd6379c0c1fe7c46b32032e2179d648bfe0d8619
size 207448
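
The three lines added for `topic_embeddings.safetensors` are a Git LFS pointer rather than the tensor data itself: each line is a key, a space, and a value. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper and handles only the three keys that appear here.

```python
# Pointer text as added in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:ef0a7c8022a1ea4efebc922cfd6379c0c1fe7c46b32032e2179d648bfe0d8619
size 207448
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size"])
```

As a rough sanity check, 135 topics × 384 dimensions (the output size of all-MiniLM-L6-v2) × 4 bytes per float32 is 207360 bytes, which would leave 88 bytes of the 207448 for the safetensors header. This is an inference from the sizes, not something the diff states.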
topics.json
ADDED
The diff for this file is too large to render.