metadata
language:
  - id
library_name: transformers
tags:
  - indobert
  - indonlu
  - indobenchmark
datasets:
  - fahrendrakhoirul/ecommerce-reviews-multilabel-dataset
metrics:
  - f1
  - precision
  - recall

This model combines IndoBERT, which encodes Indonesian text, with a Long Short-Term Memory (LSTM) layer that captures sequential information in customer reviews. It is designed for multi-label classification of e-commerce reviews, so a single review can be tagged with any combination of the following aspects:

  • Produk (Product): Customer satisfaction with product quality, performance, and description accuracy.
  • Layanan Pelanggan (Customer Service): Interaction with sellers, their responsiveness, and complaint handling.
  • Pengiriman (Shipping/Delivery): Speed of delivery, item condition upon arrival, and timeliness.
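
As a rough illustration of what multi-label means here, each review can be represented by a three-element 0/1 vector with one slot per aspect, so any combination (including none) is possible. The label order and the `encode_aspects` helper are assumptions made for this sketch; they are not taken from the dataset or the training code.

```python
# Assumed label order (hypothetical; it simply follows the list above).
ASPECTS = ["Produk", "Layanan Pelanggan", "Pengiriman"]


def encode_aspects(present):
    """Return a multi-hot vector: 1 where the aspect is mentioned, else 0."""
    unknown = set(present) - set(ASPECTS)
    if unknown:
        raise ValueError(f"unknown aspects: {sorted(unknown)}")
    return [1 if aspect in present else 0 for aspect in ASPECTS]


# A review that talks about product quality and delivery, but not the seller.
print(encode_aspects({"Produk", "Pengiriman"}))  # [1, 0, 1]
```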

How to load the model in PyTorch:

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin
from transformers import BertModel, AutoTokenizer

class IndoBertLSTMEcommerceReview(nn.Module, PyTorchModelHubMixin):
    def __init__(self, bert):
        super().__init__()
        self.bert = bert  # IndoBERT encoder
        # Layer definitions must match the published checkpoint.
        # Note that nn.LSTM is used with its defaults (batch_first=False).
        self.lstm = nn.LSTM(bert.config.hidden_size, 128)
        self.linear = nn.Linear(128, 3)  # one output per label
        self.sigmoid = nn.Sigmoid()  # independent probability per label

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        last_hidden_state = outputs.last_hidden_state  # (batch, seq_len, hidden_size)
        lstm_out, _ = self.lstm(last_hidden_state)
        pooled = lstm_out[:, -1, :]  # index -1 along the token axis
        logits = self.linear(pooled)
        probabilities = self.sigmoid(logits)
        return probabilities

# Load the base encoder, then the tokenizer and fine-tuned weights from the Hub.
bert = BertModel.from_pretrained("indobenchmark/indobert-base-p1")
tokenizer = AutoTokenizer.from_pretrained("fahrendrakhoirul/indobert-lstm-finetuned-ecommerce-reviews")
model = IndoBertLSTMEcommerceReview.from_pretrained("fahrendrakhoirul/indobert-lstm-finetuned-ecommerce-reviews", bert=bert)
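
Once the model and tokenizer are loaded as above, inference might look like the following minimal sketch. Several things here are assumptions rather than documented behavior: the output order (Produk, Layanan Pelanggan, Pengiriman) is inferred from the list earlier in this card, the 0.5 threshold is an arbitrary default, and `decode_predictions` and `predict_aspects` are hypothetical helper names. The sketch scores one review per call.

```python
# Assumed output order (hypothetical; follows the aspect list in this card).
LABELS = ["Produk", "Layanan Pelanggan", "Pengiriman"]


def decode_predictions(probabilities, labels=LABELS, threshold=0.5):
    """Map rows of per-label probabilities to lists of label names."""
    return [
        [label for label, p in zip(labels, row) if p >= threshold]
        for row in probabilities
    ]


def predict_aspects(text, model, tokenizer, threshold=0.5):
    """Tokenize one review, run the model, and decode its sigmoid outputs."""
    import torch  # local import so decode_predictions works without PyTorch

    model.eval()  # switch off dropout for inference
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probabilities = model(inputs["input_ids"], inputs["attention_mask"])
    # The model returns shape (1, 3) for a single review.
    return decode_predictions(probabilities.tolist(), threshold=threshold)[0]


# Stand-in probabilities, made up for illustration (not real model output).
example = [[0.91, 0.12, 0.77], [0.05, 0.64, 0.30]]
print(decode_predictions(example))
# [['Produk', 'Pengiriman'], ['Layanan Pelanggan']]
```

With the objects from the loading snippet, a call would be `predict_aspects("Barang bagus, pengiriman cepat", model, tokenizer)`. Because each label has its own sigmoid output, the threshold can be tuned per use case without affecting the other labels.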