---
license: apache-2.0
datasets:
- sekarmulyani/ulasan-beauty-products
language:
- id
metrics:
- f1
- roc_auc
- accuracy
pipeline_tag: text-classification
widget:
- text: mengecewakan, sebulan baru sampai, botolnya pecah
  example_title: Contoh 1
- text: kulit saya jadi kering dan bruntusan
  example_title: Contoh 2
- text: barang sudah datang tapi kurirnya lemot
  example_title: Contoh 3
- text: lipbalm oke, sesuai harga
  example_title: Contoh 4
- text: recommended buat yang pengen keliatan cantik
  example_title: Contoh 5
library_name: transformers
tags:
- rating classification
- e-commerce
---

# E-Commerce Review Rating Classification: Women's Beauty Products

## en:

Building this model involved several steps, starting from the [IndoBERTweet](https://huggingface.co/indolem/indobertweet-base-uncased) pretrained model, which was then fine-tuned on product reviews.

The main goal of this model is to address a bias that is common in product reviews on e-commerce platforms: the mismatch between the wording of a review and the rating the user gives it.

The model has several limitations. First, it was fine-tuned only on reviews of women's beauty products, so it may not generalize to other product categories. Second, the training dataset was collected by scraping. Scraping is efficient, but without supervision the scraped review ratings can themselves be biased.

Although the model has the potential to improve the quality of reviews on e-commerce platforms, its output still needs critical review. An in-depth evaluation is needed to determine how far the model actually addresses bias in product reviews, and its outputs should be interpreted with the limitations of the scraped dataset in mind.
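
As a rough illustration of the review/rating mismatch described above, the sketch below flags reviews whose user-given rating differs from a predicted rating. The model ID is taken from the citation URL in this card. The helper names `load_classifier` and `flag_mismatches` are hypothetical, and the way pipeline labels map to numeric ratings is not documented here, so the predictor is passed in as a plain function rather than assumed.

```python
from typing import Callable, Iterable, List, Tuple

# Model ID taken from the citation URL in this card.
MODEL_ID = "sekarmulyani/indobertweet-ulasan-beauty-products"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so flag_mismatches can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def flag_mismatches(
    reviews: Iterable[Tuple[str, int]],
    predict_rating: Callable[[str], int],
    tolerance: int = 0,
) -> List[Tuple[str, int, int]]:
    """Return (text, user_rating, predicted_rating) for reviews that disagree.

    `predict_rating` is a hypothetical adapter supplied by the caller: it must
    turn a review text into an integer rating. A review is flagged when the
    absolute difference between the two ratings exceeds `tolerance`.
    """
    flagged = []
    for text, user_rating in reviews:
        predicted = predict_rating(text)
        if abs(predicted - user_rating) > tolerance:
            flagged.append((text, user_rating, predicted))
    return flagged
```

Calling `load_classifier()("lipbalm oke, sesuai harga")` returns a list with one dictionary holding a `label` and a `score`. To write a `predict_rating` adapter, inspect `classifier.model.config.id2label` first to see which labels the checkpoint actually emits.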

> This project is academic in nature and was undertaken as a graduation requirement of the Information Systems undergraduate program at the Computer Science Faculty, Amikom University of Purwokerto.

---

## Training Results (WIP)
<small><em>This model is taken from the **epoch 4** checkpoint.</em></small>

| **Epoch** | **Training Loss** | **Validation Loss** | **F1** | **ROC AUC** | **Validation Accuracy** | **Test Accuracy** |
|----------|-------------------|---------------------|-----------|------------|-------------------|--------------|
| 1        | 0.374800          | 0.374789            | 0.438253  | 0.643794   | 0.344436          | 51.56%       |
| 2        | 0.346300          | 0.367311            | 0.469088  | 0.660424   | 0.384696          | 52.22%       |
| 3        | 0.311500          | 0.386395            | 0.480959  | 0.669563   | 0.423579          | 51.25%       |
| **4**        | **0.261800**          | **0.431841**            | **0.496517**  | **0.680931**   | **0.458986**          | **51.27%**       |
| 5        | 0.222300          | 0.478353            | 0.495308  | 0.681398   | 0.468297          | 50.43%       |
| 6        | 0.198800          | 0.536330            | 0.496174  | 0.682431   | 0.473149          | 50.69%       |
| 7        | 0.166200          | 0.608345            | 0.492919  | 0.680791   | 0.472166          | 49.78%       |
| 8        | 0.142400          | 0.651709            | 0.496586  | 0.683545   | 0.480428          | 50.07%       |

<small>Note: The low accuracy may be due to reviewer bias in the validation and test datasets.</small>
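
One way to read the table is to check which epoch scores best on each metric and how far validation loss drifts from training loss. The sketch below does that with the numbers copied from the table. The names `EpochResult`, `best_epoch`, and `loss_gap` are hypothetical helpers for illustration, and nothing here is meant to describe how the epoch 4 checkpoint was actually chosen.

```python
from typing import List, NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    train_loss: float
    val_loss: float
    f1: float
    roc_auc: float
    val_acc: float
    test_acc_pct: float  # the table reports test accuracy as a percentage


# Values copied from the table above.
RESULTS: List[EpochResult] = [
    EpochResult(1, 0.374800, 0.374789, 0.438253, 0.643794, 0.344436, 51.56),
    EpochResult(2, 0.346300, 0.367311, 0.469088, 0.660424, 0.384696, 52.22),
    EpochResult(3, 0.311500, 0.386395, 0.480959, 0.669563, 0.423579, 51.25),
    EpochResult(4, 0.261800, 0.431841, 0.496517, 0.680931, 0.458986, 51.27),
    EpochResult(5, 0.222300, 0.478353, 0.495308, 0.681398, 0.468297, 50.43),
    EpochResult(6, 0.198800, 0.536330, 0.496174, 0.682431, 0.473149, 50.69),
    EpochResult(7, 0.166200, 0.608345, 0.492919, 0.680791, 0.472166, 49.78),
    EpochResult(8, 0.142400, 0.651709, 0.496586, 0.683545, 0.480428, 50.07),
]


def best_epoch(results: List[EpochResult], metric: str, higher_is_better: bool = True) -> int:
    """Return the epoch number with the best value for the named metric."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda r: getattr(r, metric)).epoch


def loss_gap(result: EpochResult) -> float:
    """Validation loss minus training loss; a growing gap suggests overfitting."""
    return result.val_loss - result.train_loss


if __name__ == "__main__":
    print("lowest validation loss: epoch", best_epoch(RESULTS, "val_loss", higher_is_better=False))
    print("highest F1: epoch", best_epoch(RESULTS, "f1"))
    for r in RESULTS:
        print(r.epoch, round(loss_gap(r), 6))
```

In these numbers the validation loss is lowest at epoch 2 and the gap to training loss widens every epoch after that, while F1 stays close to 0.496 from epoch 4 onward.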

---

# Klasifikasi Rating Ulasan E-Commerce: Produk Kecantikan Wanita

## id:

Proyek pembuatan pretrained_model ini melibatkan serangkaian langkah yang dimulai dengan pemanfaatan pretrained model [IndoBERTweet](https://huggingface.co/indolem/indobertweet-base-uncased).

Tujuan utama di balik pengembangan model ini adalah mengatasi bias yang sering muncul dalam ulasan produk di platform e-commerce. Salah satu masalah klasik di platform semacam ini adalah ketidaksesuaian antara kata-kata dalam ulasan dan peringkat yang diberikan oleh pengguna. Model ini diciptakan dengan fokus pada penyelesaian masalah ini.

Namun dalam pengembangannya, terdapat sejumlah keterbatasan yang perlu diperhatikan. Pertama, model ini hanya di-fine-tune pada ulasan produk kecantikan wanita, sehingga kemampuannya terbatas pada kategori ini. Selain itu, dataset yang digunakan dalam pelatihan model ini dihasilkan melalui teknik scraping. Meskipun efisien, teknik ini juga berpotensi menimbulkan bias rating ulasan apabila tidak dilakukan supervisi.

Hasil akhir dari model ini, meskipun memiliki potensi besar untuk meningkatkan kualitas ulasan di platform e-commerce, masih perlu ditinjau secara kritis. Evaluasi mendalam diperlukan untuk memahami sejauh mana model ini berhasil mengatasi masalah bias dalam ulasan produk. Pemahaman tentang batasan dan potensi dampak dari dataset scraping juga penting dalam mengartikan keluaran dari model ini.

> Proyek ini ditujukan untuk pencapaian akademis dan dilakukan sebagai persyaratan untuk meraih gelar sarjana dalam Program Studi Sistem Informasi Fakultas Ilmu Komputer di Universitas Amikom Purwokerto.

---

# BibTeX
```
@misc {sekar_mulyani_2023,
	author       = { {Sekar Mulyani} },
	title        = { indobertweet-ulasan-beauty-products (Revision 7dbef46) },
	year         = 2023,
	url          = { https://huggingface.co/sekarmulyani/indobertweet-ulasan-beauty-products },
	doi          = { 10.57967/hf/1033 },
	publisher    = { Hugging Face }
}
```