Cheese Texture Classifier (DistilBERT)
Model Creator: Rumi Loghmani (@rlogh)
Original Dataset: aslan-ng/cheese-text (by Aslan Noorghasemi)
This model performs 4-class texture classification on cheese descriptions using fine-tuned DistilBERT.
Model Description
- Architecture: DistilBERT-base-uncased fine-tuned for sequence classification
- Task: 4-class texture classification (hard, semi-hard, semi-soft, soft)
- Input: Cheese description text (up to 512 tokens)
- Output: 4-class probability distribution
Training Details
Data
- Dataset: aslan-ng/cheese-text (original split: 100 samples)
- Train/Val/Test Split: 70/15/15 (stratified)
- Text Source: Cheese descriptions from the dataset
- Labels: Texture categories (hard, semi-hard, semi-soft, soft)
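A stratified 70/15/15 split can be sketched in plain Python as follows. This is a minimal illustration, not the author's actual splitting code: the function name `stratified_split` and the balanced label list are hypothetical, and the real class balance of the dataset is not stated here. Split sizes land near 70/15/15, with exact counts depending on per-class rounding.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Return (train, val, test) index lists with per-class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test

# Hypothetical labels: 100 samples, 25 per class (the real balance is an assumption).
labels = ["hard", "semi-hard", "semi-soft", "soft"] * 25
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting within each class keeps every texture category represented in all three subsets, which matters when the test set holds only a handful of samples per class.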
Preprocessing
- Tokenization: DistilBERT tokenizer with 512 max length
- Padding: Max length padding
- Truncation: Long descriptions truncated to 512 tokens
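The effect of max-length padding and truncation can be illustrated with a small toy function. This is a sketch of the idea rather than the tokenizer's real implementation: `pad_or_truncate` is a hypothetical helper, the content ids are made up, and `max_length` is shortened to 8 for readability. The special-token ids (0, 101, 102) are those of the distilbert-base-uncased vocabulary.

```python
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102  # special-token ids in the distilbert-base-uncased vocab

def pad_or_truncate(token_ids, max_length=512):
    """Wrap content ids in [CLS]/[SEP], truncate to max_length, then right-pad."""
    content = token_ids[: max_length - 2]  # reserve two slots for [CLS] and [SEP]
    input_ids = [CLS_ID] + content + [SEP_ID]
    attention_mask = [1] * len(input_ids)
    n_pad = max_length - len(input_ids)
    return input_ids + [PAD_ID] * n_pad, attention_mask + [0] * n_pad

# Hypothetical content ids, shortened max_length for readability.
ids, mask = pad_or_truncate([2000, 2001, 2002], max_length=8)
print(ids)   # [101, 2000, 2001, 2002, 102, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]
```

Every example ends up with the same length, and the attention mask tells the model which positions are padding.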
Training Setup
- Model: distilbert-base-uncased
- Epochs: 10
- Batch Size: 8 (train/val)
- Learning Rate: 2e-5
- Warmup Steps: 10
- Weight Decay: 0.01
- Optimizer: AdamW
- Scheduler: Linear warmup + linear decay
- Mixed Precision: FP16 (if GPU available)
- Seed: 42 (for reproducibility)
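To make the schedule concrete, the sketch below computes the learning rate at a few steps under linear warmup followed by linear decay. It rests on assumptions not stated in the setup: 70 training samples (70% of 100), no gradient accumulation, and the final partial batch of each epoch being kept, which gives 9 steps per epoch and 90 steps in total. The function name `linear_warmup_decay` is hypothetical.

```python
import math

def linear_warmup_decay(step, warmup_steps, total_steps):
    """Learning-rate multiplier: ramps 0 -> 1 over warmup, then decays linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Assumptions: 70 training samples, batch size 8, 10 epochs, no gradient accumulation.
steps_per_epoch = math.ceil(70 / 8)   # 9
total_steps = steps_per_epoch * 10    # 90
base_lr = 2e-5

for step in (0, 5, 10, 50, 90):
    lr = base_lr * linear_warmup_decay(step, warmup_steps=10, total_steps=total_steps)
    print(f"step {step:>2}: lr = {lr:.2e}")
```

Under these assumptions the 10 warmup steps cover roughly the first epoch, after which the rate falls linearly from 2e-5 to zero at step 90.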
Hardware/Compute
- Training Device: CPU
- Training Time: ~5-10 minutes on GPU
- Model Size: ~67M parameters
- Memory Usage: ~2-4GB GPU memory
Performance
- Test Accuracy: 0.400
- Test Loss: 1.290
Class-wise Performance
              precision    recall  f1-score   support
        hard       0.50      0.33      0.40         3
   semi-hard       0.29      0.50      0.36         4
   semi-soft       0.40      0.50      0.44         4
        soft       1.00      0.25      0.40         4
    accuracy                           0.40        15
   macro avg       0.55      0.40      0.40        15
weighted avg       0.55      0.40      0.40        15
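The averages in the report can be checked by hand. The sketch below recomputes them from per-class counts that are inferred from the rounded figures (for example, precision 0.29 with 2 correct implies 7 predictions of semi-hard). These counts are an assumption, not logged values, though they are consistent with the 15 test samples and every reported number.

```python
# Per-class counts inferred from the rounded report (an assumption, not logged values):
# (true positives, predicted as class, actual support)
counts = {
    "hard":      (1, 2, 3),
    "semi-hard": (2, 7, 4),
    "semi-soft": (2, 5, 4),
    "soft":      (1, 1, 4),
}

rows = {}
for name, (tp, predicted, support) in counts.items():
    precision = tp / predicted
    recall = tp / support
    f1 = 2 * precision * recall / (precision + recall)
    rows[name] = (precision, recall, f1, support)

total = sum(r[3] for r in rows.values())
accuracy = sum(tp for tp, _, _ in counts.values()) / total
# Macro average weights every class equally; weighted average weights by support.
macro = [sum(r[i] for r in rows.values()) / len(rows) for i in range(3)]
weighted = [sum(r[i] * r[3] for r in rows.values()) / total for i in range(3)]

print(f"accuracy     {accuracy:.2f}")
print("macro avg    " + " ".join(f"{v:.2f}" for v in macro))
print("weighted avg " + " ".join(f"{v:.2f}" for v in weighted))
```

With only 3 or 4 test samples per class, a single misclassification moves a class's recall by 25 to 33 percentage points, so the per-class figures should be read as rough indications.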
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model and tokenizer
model_name = "rlogh/cheese-texture-classifier-distilbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Example prediction
text = "Feta is a crumbly, tangy Greek cheese with a salty bite and creamy undertones."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()
class_names = ["hard", "semi-hard", "semi-soft", "soft"]
print(f"Predicted texture: {class_names[predicted_class]}")
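Because the model outputs a full probability distribution, it can be useful to inspect all four class probabilities rather than only the top one. The following sketch does this in plain Python; `rank_classes` is a hypothetical helper, the logits are made-up values, and the class order is assumed to match the `class_names` list used in the example above.

```python
import math

CLASS_NAMES = ["hard", "semi-hard", "semi-soft", "soft"]

def rank_classes(logits, class_names=CLASS_NAMES):
    """Convert raw logits to (label, probability) pairs, most likely first."""
    shifted = [x - max(logits) for x in logits]  # subtract max for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)

# Hypothetical logits; in practice use outputs.logits[0].tolist() from the usage example.
ranking = rank_classes([0.2, -0.1, 0.4, 1.1])
for label, prob in ranking:
    print(f"{label:<10} {prob:.3f}")
```

A close gap between the top two probabilities is a reasonable signal to treat the prediction as uncertain.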
Class Definitions
- Hard: Firm, aged cheeses that are dense and can be grated (e.g., Parmesan, Cheddar)
- Semi-hard: Moderately firm cheeses with some flexibility (e.g., Gouda, Swiss)
- Semi-soft: Cheeses with some give but maintain shape (e.g., Mozzarella, Blue cheese)
- Soft: Creamy, spreadable cheeses (e.g., Brie, Camembert, Cottage cheese)
Limitations and Ethics
Limitations
- Small Dataset: Trained on only 100 samples, limiting generalization
- Text Quality: Performance depends on description quality and consistency
- Subjective Labels: Texture classification has inherent subjectivity
- Domain Specific: Only applicable to cheese texture classification
- Language: English-only model
Ethical Considerations
- Bias: Model may reflect biases in the original dataset
- Cultural Context: Cheese descriptions may be culturally specific
- Commercial Use: Not intended for commercial cheese production decisions
- Accuracy: Should not be used for critical food safety applications
Recommendations
- Use for educational/research purposes only
- Validate predictions with domain experts
- Consider cultural context when applying to different regions
- Retrain with larger, more diverse datasets for production use
AI Usage Disclosure
This model was developed using:
- Base Model: DistilBERT (distilbert-base-uncased)
- Training Framework: Hugging Face Transformers
- Fine-tuning: Standard BERT fine-tuning techniques
- AI Assistance: An AI assistant served as a collaborative partner during development, speeding up the coding workflow and providing guidance.
Citation
Model Citation:
@model{rlogh/cheese-texture-classifier-distilbert,
  title={Cheese Texture Classifier (DistilBERT)},
  author={Rumi Loghmani},
  year={2024},
  url={https://huggingface.co/rlogh/cheese-texture-classifier-distilbert}
}
Dataset Citation:
@dataset{aslan-ng/cheese-text,
  title={Cheese Text Dataset},
  author={Aslan Noorghasemi},
  year={2024},
  url={https://huggingface.co/datasets/aslan-ng/cheese-text}
}
License
MIT License - See LICENSE file for details.