Financial News Topic Classifier

This model is a BERT-based classifier for financial news topics, fine-tuned from fuchenru/Trading-Hero-LLM. It assigns each input to one of 20 financial topics and is intended for financial NLP applications, news analytics, and automated trading systems.

Model Description

  • Architecture: BERT (for sequence classification)
  • Framework: PyTorch, Transformers
  • Topics: 20 financial news categories (see below)
  • License: MIT

Intended Uses & Limitations

  • Intended Use:
    • Classify financial news headlines or short texts into one of 20 financial topics.
    • Use in financial analytics, news monitoring, and trading agent pipelines.
  • Limitations:
    • Trained on zeroshot/twitter-financial-news-topic; may not generalize to all financial news sources.
    • Not suitable for non-financial or long-form text; training used a max_length of 256 tokens.

Topics

ID Topic
0 Analyst Update
1 Fed | Central Banks
2 Company | Product News
3 Treasuries | Corporate Debt
4 Dividend
5 Earnings
6 Energy | Oil
7 Financials
8 Currencies
9 General News | Opinion
10 Gold | Metals | Materials
11 IPO
12 Legal | Regulation
13 M&A | Investments
14 Macro
15 Markets
16 Politics
17 Personnel Change
18 Stock Commentary
19 Stock Movement
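
When working with raw logits rather than the pipeline, the table above can be written as an id-to-label mapping. The sketch below is illustrative only: `ID2LABEL` and `decode_logits` are hypothetical names, and it assumes the model's output indices follow the IDs listed above, which should be checked against the `id2label` field of the model's config.

```python
import math

# Topic IDs copied from the table above.
ID2LABEL = {
    0: "Analyst Update",
    1: "Fed | Central Banks",
    2: "Company | Product News",
    3: "Treasuries | Corporate Debt",
    4: "Dividend",
    5: "Earnings",
    6: "Energy | Oil",
    7: "Financials",
    8: "Currencies",
    9: "General News | Opinion",
    10: "Gold | Metals | Materials",
    11: "IPO",
    12: "Legal | Regulation",
    13: "M&A | Investments",
    14: "Macro",
    15: "Markets",
    16: "Politics",
    17: "Personnel Change",
    18: "Stock Commentary",
    19: "Stock Movement",
}


def decode_logits(logits):
    """Return (label, probability) for the highest-scoring class.

    Hypothetical helper: applies a numerically stable softmax to a
    list of 20 raw scores and looks up the winning topic name.
    """
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for stability
    total = sum(exps)
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best], exps[best] / total


# Made-up logits for illustration; not real model output.
example_logits = [0.0] * 20
example_logits[1] = 5.0
label, prob = decode_logits(example_logits)
print(label, round(prob, 3))
```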

Example Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("leonas5555/finnews-topic-single-classify")
model = AutoModelForSequenceClassification.from_pretrained("leonas5555/finnews-topic-single-classify")

nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Example text
text = "LIVE: ECB surprises with 50bps hike, ending its negative rate era. President Christine Lagarde is taking questions"

result = nlp(text)
print(result)
# Example output (exact score will vary): [{'label': 'Fed | Central Banks', 'score': 0.98}]

Example Inputs & Outputs

Example Text → Predicted Topic

  • "Here are Thursday's biggest analyst calls: Apple, Amazon, Tesla, Palantir, DocuSign, Exxon & more" → Analyst Update
  • "LIVE: ECB surprises with 50bps hike, ending its negative rate era." → Fed | Central Banks
  • "Goldman Sachs traders countered the industry's underwriting slump with revenue gains that raced past analysts' estimates." → Company | Product News
  • "China Evergrande Group's onshore bond holders rejected a plan by the distressed developer to further extend a bond payment." → Treasuries | Corporate Debt
  • "Investing Club: Morgan Stanley's dividend, buyback pay us for our patience after quarterly missteps" → Dividend

Training Data

  • Dataset: zeroshot/twitter-financial-news-topic
  • Size: 21,107 samples
  • Class Distribution: Unbalanced; class weights used during training.
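
The card does not say how the class weights were computed. One common choice is inverse-frequency ("balanced") weighting, where class c receives the weight n_samples / (n_classes * count_c). The sketch below assumes that scheme and runs on a toy label list, not the real dataset; `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Inverse-frequency weights: n_samples / (num_classes * class_count).

    Assumed scheme for illustration; the weighting actually used for
    this model is not documented.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    weights = []
    for class_id in range(num_classes):
        if counts[class_id] == 0:
            raise ValueError(f"class {class_id} has no samples")
        weights.append(n_samples / (num_classes * counts[class_id]))
    return weights


# Toy labels: class 0 is frequent, class 2 is rare.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
weights = balanced_class_weights(toy_labels, num_classes=3)
print(weights)
```

Rare classes receive larger weights, so their errors contribute more to the loss. In PyTorch such a list is typically converted to a tensor and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.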

Training Procedure

  • Framework: HuggingFace Transformers (Trainer API)

  • Arguments:

    • num_train_epochs: 10
    • per_device_train_batch_size: 32
    • per_device_eval_batch_size: 32
    • gradient_accumulation_steps: 1
    • learning_rate: 2e-5
    • fp16: True (Native AMP mixed precision)
    • warmup_ratio: 0.1
    • label_smoothing_factor: 0.05
    • max_grad_norm: 1.0
    • max_length: 256
    • evaluation_strategy: "steps"
    • save_strategy: "steps"
    • save_total_limit: 3
    • load_best_model_at_end: True
    • metric_for_best_model: "f1"
    • run_name: "topic_classifier"
    • seed: 42
  • Early Stopping: Patience of 2 evaluation steps (via EarlyStoppingCallback)

  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)

  • Scheduler: Linear

  • Metrics: F1 (for best model selection), plus accuracy, precision, recall
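
To make the warmup and decay concrete, here is a minimal sketch of a linear schedule with a warmup_ratio of 0.1 and a peak learning rate of 2e-5, as listed above. The total of 5300 steps is an assumption read off the last row of the Evaluation Results log, and `linear_lr` is a hypothetical helper that reproduces the usual warmup-then-linear-decay shape rather than the exact Trainer internals.

```python
import math


def linear_lr(step, total_steps=5300, peak_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to peak_lr during warmup, then decays
    linearly to 0 at total_steps. total_steps=5300 is an assumption.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at the start, mid-warmup, peak, midpoint of decay, and end.
for s in (0, 265, 530, 2915, 5300):
    print(s, linear_lr(s))
```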

Evaluation Results

Step  Training Loss  Validation Loss  Accuracy  Precision  Recall    F1
530   1.965800       0.917674         0.805684  0.743887   0.691372  0.696721
1060  0.733100       0.684078         0.876366  0.815078   0.823771  0.817982
1590  0.512200       0.638335         0.895312  0.895471   0.893691  0.893341
2120  0.418200       0.682780         0.894826  0.880995   0.885067  0.880227
2650  0.380200       0.683890         0.902113  0.890379   0.901867  0.894882
3180  0.359500       0.696923         0.902599  0.881292   0.902299  0.888526
3710  0.348800       0.691665         0.906000  0.891074   0.902236  0.895001
4240  0.342900       0.687194         0.906728  0.896421   0.900574  0.896865
4770  0.339900       0.705139         0.904785  0.892559   0.903573  0.896804
5300  0.337400       0.697512         0.907943  0.897653   0.903964  0.899527
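
The interaction between best-model selection on F1 and early stopping with a patience of 2 can be replayed from the log above. This is a simplified sketch: `replay_early_stopping` is a hypothetical helper that counts consecutive evaluations without a strict F1 improvement and ignores any improvement threshold, so it approximates rather than reproduces `EarlyStoppingCallback`.

```python
# (step, F1) pairs copied from the evaluation log above.
EVAL_F1 = [
    (530, 0.696721),
    (1060, 0.817982),
    (1590, 0.893341),
    (2120, 0.880227),
    (2650, 0.894882),
    (3180, 0.888526),
    (3710, 0.895001),
    (4240, 0.896865),
    (4770, 0.896804),
    (5300, 0.899527),
]


def replay_early_stopping(history, patience=2):
    """Return (best_step, best_f1, stop_step).

    stop_step is the step at which training would have stopped, or
    None if the patience limit was never reached.
    """
    best_step, best_f1 = None, float("-inf")
    bad_evals = 0  # consecutive evaluations without improvement
    for step, f1 in history:
        if f1 > best_f1:
            best_step, best_f1 = step, f1
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return best_step, best_f1, step
    return best_step, best_f1, None


best_step, best_f1, stop_step = replay_early_stopping(EVAL_F1)
print(best_step, best_f1, stop_step)  # prints: 5300 0.899527 None
```

Under these assumptions, F1 never fails to improve twice in a row, so early stopping does not trigger and the step 5300 checkpoint has the best F1.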

ONNX Export

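An ONNX version of this model: {TBD}. It is meant for use with high-performance inference engines such as Infinity. To export the model locally with Optimum (the output directory name below is an arbitrary example):

optimum-cli export onnx -m leonas5555/finnews-topic-single-classify finnews-topic-onnx/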

License

MIT
