---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# ISSR_Dark_Web_7Topics

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that generates easily interpretable topics from large datasets.

## Usage

To use this model, install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("D0men1c0/ISSR_Dark_Web_7Topics")

topic_model.get_topic_info()
```

You can make predictions as follows:

```python
sentence = ['closed market']
topic, _ = topic_model.transform(sentence)
topic_model.get_topic_info(topic[0])
```

## Topic overview

* Number of topics: 8 (including the outlier topic `-1`)
* Number of training documents: 65529

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | anyone - new - help - free - please | 2823 | -1_anyone_new_help_free |
| 0 | weed - xanax - vendor - cocaine - mg | 27613 | Drug Vendor Europe |
| 1 | market - empire - dream - nightmare - vendor | 8645 | Dream Vendor Nightmare |
| 2 | vendor - scammer - scam - looking - scamming | 6236 | Trusted Vendor Scams |
| 3 | review - vendor review - vendor - review vendor - review review | 6907 | Vendor MDMA Review |
| 4 | mdma - lsd - get - looking - wsm | 4230 | Drug Discussion |
| 5 | order - package - shipping - delivery - pack | 6299 | Order Shipping & Tracking |
| 6 | bitcoin - card - wallet - btc - bank | 2776 | Financial Services and Products |
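
To interpret a predicted topic ID without reloading the model, the table above can be kept as a small lookup. The following is only a sketch: the IDs, labels, and frequencies are copied from the table, while the names `TOPICS`, `topic_share`, and `describe` are hypothetical helpers that are not part of BERTopic.

```python
# Topic ID -> (label, frequency), copied from the topic table in this card.
TOPICS = {
    -1: ("-1_anyone_new_help_free", 2823),
    0: ("Drug Vendor Europe", 27613),
    1: ("Dream Vendor Nightmare", 8645),
    2: ("Trusted Vendor Scams", 6236),
    3: ("Vendor MDMA Review", 6907),
    4: ("Drug Discussion", 4230),
    5: ("Order Shipping & Tracking", 6299),
    6: ("Financial Services and Products", 2776),
}

# Total number of training documents, summed over all topics (outliers included).
TOTAL_DOCS = sum(freq for _, freq in TOPICS.values())


def topic_share(topic_id: int) -> float:
    """Fraction of training documents assigned to the given topic."""
    return TOPICS[topic_id][1] / TOTAL_DOCS


def describe(topic_id: int) -> str:
    """Human-readable summary of a topic ID (hypothetical helper)."""
    label, freq = TOPICS[topic_id]
    return f"{topic_id}: {label} ({freq} docs, {topic_share(topic_id):.1%})"


if __name__ == "__main__":
    for tid in sorted(TOPICS):
        print(describe(tid))
```

The frequencies sum to the 65529 training documents reported in the overview, which is a quick consistency check on the table. A topic ID returned by `topic_model.transform(...)` can be passed to `describe` directly.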

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 2)
* nr_topics: None
* seed_topic_list: [['tor site', 'drug', 'cocaine', 'ketamine', 'weed', 'trafficking', 'scammer', 'market', 'vendor', 'bitcoin', 'mdma', 'coke', 'lsd', 'heroine', 'xanax', 'tor node', 'tor site', 'gun', 'weapon', 'hacking']]
* top_n_words: 10
* verbose: True
* zeroshot_min_similarity: 0.05
* zeroshot_topic_list: [['burglary', 'buy drugs', 'buy weapons', 'child abuse', 'check sale', 'corruption', 'counterfeit money', 'drugs', 'espionage', 'fake IDs', 'find vendor', 'fraud', 'gun', 'hacking', 'kidnapping', 'murder', 'organ trafficking', 'pedophilia', 'rape', 'scammer', 'sell drugs', 'terrorism', 'trafficking']]

## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.36
* UMAP: 0.5.6
* Pandas: 2.2.1
* Scikit-Learn: 1.4.1.post1
* Sentence-transformers: 3.0.1
* Transformers: 4.39.3
* Numba: 0.60.0
* Plotly: 5.22.0
* Python: 3.12.2
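
Loading a saved model can be sensitive to library versions, so it may help to compare the local environment against the versions listed above. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are assumptions about how each listed package is installed, and `CARD_VERSIONS` and `compare_versions` are hypothetical names introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions listed in this card, keyed by the assumed PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.36",
    "umap-learn": "0.5.6",
    "pandas": "2.2.1",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.39.3",
    "numba": "0.60.0",
    "plotly": "5.22.0",
}


def compare_versions(expected: dict) -> dict:
    """Return {package: (card_version, installed_version_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        installed: Optional[str]
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
        print(f"{name}: card lists {wanted}, installed {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports where the environment differs from the one recorded in this card.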