haider0941 committed
Commit
1a899b5
1 Parent(s): 4873fa9

Update README.md

Files changed (1)
  1. README.md +78 -3
README.md CHANGED
---
license: apache-2.0
---

# DistilBERT for Educational Query Classification

## Model Details

- **Model Name**: DistilBERT for Educational Query Classification
- **Model Architecture**: DistilBERT (base model: `distilbert-base-uncased`)
- **Language**: English
- **Model Type**: Transformer-based text classification model
- **License**: [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)

## Overview

This model is a fine-tuned version of [DistilBERT](https://huggingface.co/distilbert-base-uncased) designed to classify queries as either educational or non-educational. It was trained on a dataset of questions and statements, each labeled "educational" or "non-educational."

## Intended Use

- **Primary Use Case**: Classifying text inputs as "educational" or "non-educational," for applications that need to filter out or prioritize educational content.
- **Potential Applications**:
  - Educational chatbots or virtual assistants
  - Content moderation for educational platforms
  - Automated tagging of educational content
  - Filtering non-educational queries from educational websites or apps (see the sketch after this list)
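
For example, a minimal filtering helper built on the Transformers `pipeline` API might look like the sketch below. This is an illustration, not part of the released code; the exact label string for the educational class depends on this model's `id2label` config, so verify it before relying on the filter.

```python
from transformers import pipeline

# Load this model behind the standard text-classification pipeline.
classifier = pipeline("text-classification", model="haider0941/distilbert-base-educationl")

# Assumption: the educational class is exposed under this label name;
# confirm against classifier.model.config.id2label.
EDUCATIONAL_LABEL = "educational"

def filter_educational(queries):
    """Keep only the queries the classifier labels as educational."""
    results = classifier(queries)
    return [q for q, r in zip(queries, results) if r["label"] == EDUCATIONAL_LABEL]

print(filter_educational(["What is photosynthesis?", "Where should we eat tonight?"]))
```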

## Training Data

- **Dataset**: A custom dataset of queries, each labeled "educational" or "non-educational" based on its content.
- **Dataset Source**: Manually curated to include a balanced mix of educational questions (covering various academic subjects) and non-educational questions (general queries unrelated to educational content).

## Training Procedure

- **Framework**: The model was trained using the [Hugging Face Transformers library](https://huggingface.co/transformers/) with PyTorch.
- **Fine-Tuning Parameters**:
  - **Batch Size**: 16
  - **Learning Rate**: 5e-5
  - **Epochs**: 3
  - **Optimizer**: AdamW with weight decay
- **Hardware**: Fine-tuning was performed on a single NVIDIA V100 GPU.
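
The exact training script is not published; as a rough sketch, fine-tuning with the parameters above might look like the following (the two-row dataset is a toy stand-in for the curated data, and the 0.01 weight decay is an assumption):

```python
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Toy stand-in for the curated dataset: label 1 = educational, 0 = non-educational.
raw = Dataset.from_dict({
    "text": ["What is photosynthesis?", "What time is the party tonight?"],
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

train_dataset = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=16,  # batch size from the card
    learning_rate=5e-5,              # learning rate from the card
    num_train_epochs=3,              # epochs from the card
    weight_decay=0.01,               # AdamW weight decay; exact value assumed
)

# Trainer uses AdamW by default, matching the optimizer listed above.
Trainer(model=model, args=args, train_dataset=train_dataset).train()
```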

## Limitations and Bias

While this model has been fine-tuned for classifying queries as educational or non-educational, there are some limitations and potential biases:

- **Bias in Data**: The model may reflect biases present in the training data, particularly if certain topics or types of educational content are overrepresented or underrepresented.
- **Binary Classification**: The model categorizes inputs strictly as "educational" or "non-educational" and may not handle nuanced or ambiguous queries well (see the confidence sketch after this list).
- **Not Suitable for Other Tasks**: The model is designed specifically for educational vs. non-educational classification and is unlikely to perform well on other classification tasks without further fine-tuning.
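
One way to soften the hard binary decision is to act only on confident predictions. A minimal sketch (the 0.8 threshold is an arbitrary assumption, not a tuned value):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("haider0941/distilbert-base-educationl")
model = AutoModelForSequenceClassification.from_pretrained("haider0941/distilbert-base-educationl")

def classify_with_confidence(text, threshold=0.8):
    """Return the predicted label, or "uncertain" for low-confidence inputs."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    conf, idx = probs.max(dim=-1)
    if conf.item() < threshold:
        return "uncertain"  # ambiguous query: defer to a human or a fallback rule
    return model.config.id2label[idx.item()]

print(classify_with_confidence("Tell me something interesting."))
```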

## How to Use

You can load the model with the Hugging Face Transformers library and read off the predicted class:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("haider0941/distilbert-base-educationl")
model = AutoModelForSequenceClassification.from_pretrained("haider0941/distilbert-base-educationl")

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Map the higher-scoring logit back to its label name; check
# model.config.id2label to see which index is "educational".
predicted = outputs.logits.argmax(dim=-1).item()
print(model.config.id2label[predicted])
```

## Citation

If you use this model, please cite it as follows:

```bibtex
@misc{haider0941_2024,
  title={Fine-Tuned DistilBERT for Educational Query Classification},
  author={Haider0941},
  year={2024},
  howpublished={\url{https://huggingface.co/haider0941/distilbert-base-educationl}},
}
```