Financial News Topic Classifier
This model is a BERT-based classifier, fine-tuned from fuchenru/Trading-Hero-LLM, that assigns a financial news text to one of 20 topics. It is designed for use in financial NLP applications, news analytics, and automated trading systems.
Model Description
- Architecture: BERT (for sequence classification)
- Framework: PyTorch, Transformers
- Topics: 20 financial news categories (see below)
- License: MIT
Intended Uses & Limitations
- Intended Use:
  - Classify financial news headlines or short texts into one of 20 financial topics.
  - Use in financial analytics, news monitoring, and trading agent pipelines.
- Limitations:
  - Trained on zeroshot/twitter-financial-news-topic; may not generalize to all financial news sources.
  - Not suitable for non-financial or long-form text.
Topics
| ID | Topic |
|---|---|
| 0 | Analyst Update |
| 1 | Fed \| Central Banks |
| 2 | Company \| Product News |
| 3 | Treasuries \| Corporate Debt |
| 4 | Dividend |
| 5 | Earnings |
| 6 | Energy \| Oil |
| 7 | Financials |
| 8 | Currencies |
| 9 | General News \| Opinion |
| 10 | Gold \| Metals \| Materials |
| 11 | IPO |
| 12 | Legal \| Regulation |
| 13 | M&A \| Investments |
| 14 | Macro |
| 15 | Markets |
| 16 | Politics |
| 17 | Personnel Change |
| 18 | Stock Commentary |
| 19 | Stock Movement |
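For downstream code it can help to keep the table above as a lookup. The snippet below is a minimal sketch that transcribes the table into a Python dictionary. The names ID2LABEL, LABEL2ID, and topic_name are illustrative and not part of the model repository; if the checkpoint's own config.id2label differs, treat the checkpoint as authoritative.

```python
# Topic IDs and names transcribed from the table above.
# Illustrative helper only; not shipped with the model.
ID2LABEL = {
    0: "Analyst Update",
    1: "Fed | Central Banks",
    2: "Company | Product News",
    3: "Treasuries | Corporate Debt",
    4: "Dividend",
    5: "Earnings",
    6: "Energy | Oil",
    7: "Financials",
    8: "Currencies",
    9: "General News | Opinion",
    10: "Gold | Metals | Materials",
    11: "IPO",
    12: "Legal | Regulation",
    13: "M&A | Investments",
    14: "Macro",
    15: "Markets",
    16: "Politics",
    17: "Personnel Change",
    18: "Stock Commentary",
    19: "Stock Movement",
}

# Reverse lookup from topic name to ID.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def topic_name(topic_id: int) -> str:
    """Return the topic name for an ID, raising ValueError for unknown IDs."""
    try:
        return ID2LABEL[topic_id]
    except KeyError:
        raise ValueError(f"unknown topic id: {topic_id}") from None


print(topic_name(1))  # Fed | Central Banks
```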
Example Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
tokenizer = AutoTokenizer.from_pretrained("leonas5555/finnews-topic-single-classify")
model = AutoModelForSequenceClassification.from_pretrained("leonas5555/finnews-topic-single-classify")
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
# Example text
text = "LIVE: ECB surprises with 50bps hike, ending its negative rate era. President Christine Lagarde is taking questions"
result = nlp(text)
print(result)
# Output: [{'label': 'Fed | Central Banks', 'score': 0.98}]
Example Inputs & Outputs
| Example Text | Predicted Topic |
|---|---|
| "Here are Thursday's biggest analyst calls: Apple, Amazon, Tesla, Palantir, DocuSign, Exxon & more" | Analyst Update |
| "LIVE: ECB surprises with 50bps hike, ending its negative rate era." | Fed \| Central Banks |
| "Goldman Sachs traders countered the industry's underwriting slump with revenue gains that raced past analysts' estimates." | Company \| Product News |
| "China Evergrande Group's onshore bond holders rejected a plan by the distressed developer to further extend a bond payment." | Treasuries \| Corporate Debt |
| "Investing Club: Morgan Stanley's dividend, buyback pay us for our patience after quarterly missteps" | Dividend |
Training Data
- Dataset: zeroshot/twitter-financial-news-topic
- Size: 21,107 samples
- Class Distribution: Unbalanced; class weights used during training.
Training Procedure
Framework: HuggingFace Transformers (Trainer API)
Arguments:
- num_train_epochs: 10
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 32
- gradient_accumulation_steps: 1
- learning_rate: 2e-5
- fp16: True (Native AMP mixed precision)
- warmup_ratio: 0.1
- label_smoothing_factor: 0.05
- max_grad_norm: 1.0
- max_length: 256
- evaluation_strategy: "steps"
- save_strategy: "steps"
- save_total_limit: 3
- load_best_model_at_end: True
- metric_for_best_model: "f1"
- run_name: "topic_classifier"
- seed: 42
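To illustrate what label_smoothing_factor: 0.05 does, the sketch below computes cross-entropy against a smoothed target that puts 1 - epsilon on the true class and spreads epsilon uniformly over all classes. This is one common formulation written in plain Python for clarity; the function name smoothed_cross_entropy is hypothetical, and the Trainer's internal implementation may differ in details.

```python
import math


def smoothed_cross_entropy(logits, target, epsilon=0.05):
    """Cross-entropy against a label-smoothed target distribution.

    Equivalent to (1 - epsilon) * NLL(true class)
    + epsilon * mean over classes of -log p(class).
    """
    num_classes = len(logits)
    # Numerically stable log-softmax.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    log_probs = [x - log_norm for x in logits]
    nll = -log_probs[target]
    uniform_term = -sum(log_probs) / num_classes
    return (1.0 - epsilon) * nll + epsilon * uniform_term


# A confident, correct prediction is still penalized slightly when smoothing is on.
logits = [8.0] + [0.0] * 19  # 20 classes, class 0 strongly favored
print(smoothed_cross_entropy(logits, target=0, epsilon=0.0))
print(smoothed_cross_entropy(logits, target=0, epsilon=0.05))
```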
Early Stopping: Patience of 2 evaluation steps (via EarlyStoppingCallback)
Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
Scheduler: Linear
Metrics: F1 (for best model selection), plus accuracy, precision, recall
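The combination of warmup_ratio: 0.1 and a linear scheduler means the learning rate ramps up from zero to 2e-5 over the first 10% of steps, then decays linearly back to zero. The plain-Python sketch below reproduces that shape. The function name linear_schedule_lr is hypothetical, and the total of 5,300 steps is an assumption based on the last step reported in the evaluation results; the exact number of optimizer steps is not stated in this card.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        factor = step / max(1, warmup_steps)
    else:
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * factor


# Assumption: 5,300 total steps (last step listed in the evaluation results).
TOTAL_STEPS = 5300
for step in (0, 265, 530, 2915, 5300):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```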
Evaluation Results
| Step | Training Loss | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| 530 | 1.965800 | 0.917674 | 0.805684 | 0.743887 | 0.691372 | 0.696721 |
| 1060 | 0.733100 | 0.684078 | 0.876366 | 0.815078 | 0.823771 | 0.817982 |
| 1590 | 0.512200 | 0.638335 | 0.895312 | 0.895471 | 0.893691 | 0.893341 |
| 2120 | 0.418200 | 0.682780 | 0.894826 | 0.880995 | 0.885067 | 0.880227 |
| 2650 | 0.380200 | 0.683890 | 0.902113 | 0.890379 | 0.901867 | 0.894882 |
| 3180 | 0.359500 | 0.696923 | 0.902599 | 0.881292 | 0.902299 | 0.888526 |
| 3710 | 0.348800 | 0.691665 | 0.906000 | 0.891074 | 0.902236 | 0.895001 |
| 4240 | 0.342900 | 0.687194 | 0.906728 | 0.896421 | 0.900574 | 0.896865 |
| 4770 | 0.339900 | 0.705139 | 0.904785 | 0.892559 | 0.903573 | 0.896804 |
| 5300 | 0.337400 | 0.697512 | 0.907943 | 0.897653 | 0.903964 | 0.899527 |
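The F1 column can be replayed against the early-stopping rule (patience of 2) and the best-model criterion (F1) from the training procedure. The sketch below is a simplified simulation, not the EarlyStoppingCallback itself: it assumes a zero improvement threshold, and the name replay_early_stopping is hypothetical. On these values, F1 never fails to improve on the running best twice in a row, so training runs to the final step, which is also the best checkpoint.

```python
# (step, F1) pairs copied from the evaluation table above.
F1_BY_STEP = [
    (530, 0.696721), (1060, 0.817982), (1590, 0.893341), (2120, 0.880227),
    (2650, 0.894882), (3180, 0.888526), (3710, 0.895001), (4240, 0.896865),
    (4770, 0.896804), (5300, 0.899527),
]


def replay_early_stopping(history, patience=2):
    """Return (best_step, best_f1, stop_step).

    stop_step is None if the patience limit is never reached.
    Simplification: any strict improvement over the running best resets the counter.
    """
    best_step, best_f1 = None, float("-inf")
    stale = 0
    for step, f1 in history:
        if f1 > best_f1:
            best_step, best_f1, stale = step, f1, 0
        else:
            stale += 1
            if stale >= patience:
                return best_step, best_f1, step
    return best_step, best_f1, None


print(replay_early_stopping(F1_BY_STEP))
# (5300, 0.899527, None)
```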
ONNX Export
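An ONNX version of this model ({TBD}) is intended for use with high-performance inference engines such as Infinity. The model can be exported with Optimum; the export command requires an output directory, and the name finnews-topic-onnx/ used here is arbitrary:
optimum-cli export onnx -m leonas5555/finnews-topic-single-classify finnews-topic-onnx/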
License
MIT
Evaluation results (self-reported)
- Accuracy on zeroshot/twitter-financial-news-topic: 0.908
- F1 on zeroshot/twitter-financial-news-topic: 0.900