tykea committed
Commit 76f5303 (verified)
Parent(s): a0d052c

Update README.md

Files changed (1): README.md (+40 -1)
README.md CHANGED
@@ -6,4 +6,43 @@ base_model:
 - FacebookAI/xlm-roberta-base
 pipeline_tag: text-classification
 library_name: transformers
----
+tags:
+- sentiment
+---
+
+This is a fine-tuned version of XLM-RoBERTa for sentiment analysis. The model classifies text into two categories, Positive and Negative. It accepts inputs of up to 512 tokens and performs well on Khmer text.
+- **Task**: Sentiment analysis (binary classification).
+- **Languages Supported**: Khmer.
+- **Intended Use Cases**:
+  - Analyzing customer reviews.
+  - Social media sentiment detection.
+- **Limitations**:
+  - Performance may degrade on languages or domains not present in the training data.
+  - Does not handle sarcasm or highly ambiguous inputs well.
+
+The model was evaluated on a test set of [Number] samples, achieving the following performance:
+
+- **Test Accuracy**: 83.25%
+- **Precision**: 83.55%
+- **Recall**: 83.25%
+- **F1 Score**: 83.25%
+
+Confusion matrix:
+
+| Predicted \ Actual | Negative | Positive |
+|--------------------|----------|----------|
+| **Negative**       | 166      | 42       |
+| **Positive**       | 25       | 167      |
+
+The model supports a maximum sequence length of 512 tokens.
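The headline metrics can be reproduced from the confusion matrix above; a minimal sanity check, assuming the reported Precision and Recall are support-weighted averages (which matches the reported values):

```python
# Sanity-check the reported metrics against the confusion matrix
# (rows = predicted, columns = actual).
tn, fn = 166, 42   # predicted Negative: actually Negative / actually Positive
fp, tp = 25, 167   # predicted Positive: actually Negative / actually Positive
total = tn + fn + fp + tp  # 400

accuracy = (tn + tp) / total

# Per-class precision and recall
prec_pos, rec_pos = tp / (tp + fp), tp / (tp + fn)
prec_neg, rec_neg = tn / (tn + fn), tn / (tn + fp)

# Support-weighted averages (weights = number of actual samples per class)
support_neg, support_pos = tn + fp, fn + tp  # 191, 209
precision = (support_neg * prec_neg + support_pos * prec_pos) / total
recall = (support_neg * rec_neg + support_pos * rec_pos) / total

print(f"accuracy={accuracy:.2%}, precision={precision:.2%}, recall={recall:.2%}")
```

Note that with support weighting, weighted recall equals overall accuracy, which is why both report 83.25%.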
+
+## How to Use
+
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+tokenizer = AutoTokenizer.from_pretrained("Tykea/khmer-text-sentiment-analysis-roberta")
+model = AutoModelForSequenceClassification.from_pretrained("Tykea/khmer-text-sentiment-analysis-roberta")
+
+text = "ខ្ញុំស្រឡាញ់ CADT"  # Khmer example input
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+with torch.no_grad():
+    outputs = model(**inputs)
+prediction = outputs.logits.argmax(dim=-1)
+labels_mapping = {0: "negative", 1: "positive"}
+print("Predicted Class:", labels_mapping[prediction.item()])
+```
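The snippet above takes an argmax over raw logits; applying a softmax first yields a confidence score alongside the label. A minimal sketch with hypothetical logits (in practice they come from `outputs.logits`):

```python
import math

# Hypothetical logits for one input; in practice use outputs.logits[0].tolist()
logits = [-1.2, 2.3]

# Softmax turns logits into probabilities that sum to 1
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

labels_mapping = {0: "negative", 1: "positive"}
pred = max(range(len(logits)), key=lambda i: logits[i])
print(f"Predicted Class: {labels_mapping[pred]} ({probs[pred]:.1%} confidence)")
```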