---
license: mit
language:
- ru
metrics:
- accuracy
pipeline_tag: text-classification
widget:
- text: "Взрыв газа произошел в 2-этажном доме в поселке под Казанью, пострадали четыре человека, сообщает МЧС"
  example_title: "Новость"
- text: "Сын поздравил меня с днём рождения стихами ❤️"
  example_title: "Не новость"
---

## Model Details

### Model Description

News_classifier is a fine-tuned model for binary classification (news / not news) of posts from various Russian-language Telegram channels. It can be integrated into a news aggregation service.

- **Model type:** Sentence RuBERT (Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters)
- **Language(s):** Russian (ru)
- **License:** MIT
- **Finetuned from model:** `DeepPavlov/rubert-base-cased-sentence`

## Dataset

- Russian Telegram posts
- train/valid/test: 2970/165/165

## Training Details

- token max length: 512
- num labels: 2
- batch size: 16
- learning rate: 2e-5
- train epochs: 20
- weight decay: 0.01

## Metrics

- Matthews correlation (evaluation metric used during training): 0.89
- Accuracy: 0.95

## Label Scheme

- LABEL_1 - news
- LABEL_0 - not news
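
## Usage

A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub and loaded through the `transformers` text-classification pipeline. The repository ID `your-namespace/news_classifier` is a placeholder, not the real ID, and the helper names `to_readable` and `classify` are hypothetical. The label mapping and the 512-token limit are taken from this card.

```python
# Mapping taken from the label scheme in this card.
LABEL_NAMES = {"LABEL_1": "news", "LABEL_0": "not news"}

# Placeholder (assumption): replace with the actual Hub repository ID of this model.
MODEL_ID = "your-namespace/news_classifier"


def to_readable(prediction: dict) -> dict:
    """Convert one pipeline prediction ({'label': ..., 'score': ...}) to a readable label."""
    return {
        "label": LABEL_NAMES[prediction["label"]],
        "score": float(prediction["score"]),
    }


def classify(texts: list[str], model_id: str = MODEL_ID) -> list[dict]:
    """Classify Telegram posts as news / not news (hypothetical helper)."""
    # Imported lazily so the label helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # Truncate inputs to the 512-token limit used during training.
    predictions = classifier(texts, truncation=True, max_length=512)
    return [to_readable(p) for p in predictions]
```

With the real repository ID in place, `classify(["Сын поздравил меня с днём рождения стихами ❤️"])` returns a list with one dictionary holding the readable label and its score.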