---
license: apache-2.0
datasets:
- sekarmulyani/ulasan_makeup
language:
- id
metrics:
- f1
pipeline_tag: text-classification
widget:
- text: "mengecewakan, sebulan baru sampai, botolnya pecah"
  example_title: "Contoh 1"
- text: "warnanya kok beda ya, pesen cokelat yang dateng merah"
  example_title: "Contoh 2"
- text: "terimakasih, barang sudah datang tapi kurirnya lemot"
  example_title: "Contoh 3"
---

# E-Commerce Review Rating Classification: Women's Beauty Products

## en:

This pretrained_model project starts from the [IndoBERTweet](https://huggingface.co/indolem/indobertweet-base-uncased) pretrained model and fine-tunes it, using the optimizer.pt from that pretrained model. The primary objective is to address the bias frequently encountered in product reviews on e-commerce platforms. A classic issue on such platforms is the disconnect between the language used in a review and the rating the user gives. This model was conceived to mitigate that problem.

Several limitations should be noted. First, the model is trained only on reviews of women's beauty products, so it may not generalize to other product categories. Second, the training dataset was collected by scraping. While efficient, this technique risks introducing bias.

The model's output has substantial potential to improve the quality of reviews on e-commerce platforms, but it still requires critical review. In-depth evaluation is needed to determine how far the model succeeds in addressing bias in product reviews.
Understanding the limitations and potential impact of the scraped dataset is also vital when interpreting the model's outputs.

> This project is academic in orientation and was undertaken as a graduation requirement of the Information System undergraduate program at the Computer Science Faculty, Amikom University of Purwokerto.

---

## Training Data

This model is the epoch 6 checkpoint.

| Epoch | Training Loss | Validation Loss | F1       | ROC AUC  | Accuracy |
|-------|---------------|-----------------|----------|----------|----------|
| 1     | 0.371500      | 0.368949        | 0.412707 | 0.630868 | 0.304636 |
| 2     | 0.349400      | 0.368781        | 0.457155 | 0.653646 | 0.366337 |
| 3     | 0.322000      | 0.379917        | 0.488654 | 0.673030 | 0.422530 |
| 4     | 0.286300      | 0.412161        | 0.495117 | 0.679087 | 0.448102 |
| 5     | 0.248400      | 0.451065        | 0.488383 | 0.676833 | 0.455970 |
| 6     | 0.213400      | 0.496999        | 0.494043 | 0.680316 | 0.462986 |
| 7     | 0.179300      | 0.540976        | 0.489517 | 0.677931 | 0.461609 |
| 8     | 0.156000      | 0.599047        | 0.486695 | 0.677005 | 0.464297 |

Note: The relatively low accuracy may be attributable to reviewer bias in the test dataset.

---

# E-Commerce Review Rating Classification: Women's Beauty Products

## id (English translation):

This pretrained_model project involves a series of steps that begin with the [IndoBERTweet](https://huggingface.co/indolem/indobertweet-base-uncased) pretrained model. Through careful fine-tuning, the model was refined using the optimizer.pt included in that pretrained model. The main goal behind developing this model is to address the bias that often appears in product reviews on e-commerce platforms. A classic problem on such platforms is the mismatch between the words in a review and the rating given by the user. This model was created with a focus on solving that problem.

However, several limitations arose during development and should be noted. First, the model is trained only on reviews of women's beauty products, so its ability to generalize may be limited to certain categories. In addition, the dataset used to train the model was produced by scraping. Although this technique is efficient, it also risks introducing bias.

The model's final output, although it has great potential to improve the quality of reviews on e-commerce platforms, still needs to be reviewed critically. In-depth evaluation is needed to understand how far the model succeeds in addressing bias in product reviews. Understanding the limitations and potential impact of the scraped dataset is also important when interpreting the model's output.

> This project is intended for academic purposes and was carried out as a requirement for a bachelor's degree in the Information Systems Study Program, Faculty of Computer Science, Universitas Amikom Purwokerto.

---

## BibTex:

    @misc {sekar_mulyani_2023,
        author    = { {Sekar Mulyani} },
        title     = { ulasan-beauty-products (Revision b8202dc) },
        year      = 2023,
        url       = { https://huggingface.co/datasets/sekarmulyani/ulasan-beauty-products },
        doi       = { 10.57967/hf/1028 },
        publisher = { Hugging Face }
    }
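
## Checkpoint Comparison

The per-epoch figures in the Training Data table can be compared programmatically to see which epoch is best under each metric and how the gap between validation and training loss develops. The following is a minimal sketch, not part of the original training code: the names `EpochMetrics`, `HISTORY`, `best_epoch`, and `loss_gap` are illustrative assumptions, and the numbers are copied from the table.

```python
from typing import NamedTuple


class EpochMetrics(NamedTuple):
    # Hypothetical container for one row of the Training Data table.
    epoch: int
    training_loss: float
    validation_loss: float
    f1: float
    roc_auc: float
    accuracy: float


# Values copied from the Training Data table of this model card.
HISTORY = [
    EpochMetrics(1, 0.371500, 0.368949, 0.412707, 0.630868, 0.304636),
    EpochMetrics(2, 0.349400, 0.368781, 0.457155, 0.653646, 0.366337),
    EpochMetrics(3, 0.322000, 0.379917, 0.488654, 0.673030, 0.422530),
    EpochMetrics(4, 0.286300, 0.412161, 0.495117, 0.679087, 0.448102),
    EpochMetrics(5, 0.248400, 0.451065, 0.488383, 0.676833, 0.455970),
    EpochMetrics(6, 0.213400, 0.496999, 0.494043, 0.680316, 0.462986),
    EpochMetrics(7, 0.179300, 0.540976, 0.489517, 0.677931, 0.461609),
    EpochMetrics(8, 0.156000, 0.599047, 0.486695, 0.677005, 0.464297),
]


def best_epoch(history, metric, higher_is_better=True):
    """Return the epoch number whose value for `metric` is best."""
    pick = max if higher_is_better else min
    return pick(history, key=lambda row: getattr(row, metric)).epoch


def loss_gap(row):
    """Validation loss minus training loss; a widening gap suggests overfitting."""
    return row.validation_loss - row.training_loss


if __name__ == "__main__":
    for metric in ("f1", "roc_auc", "accuracy"):
        print(f"{metric}: best at epoch {best_epoch(HISTORY, metric)}")
    lowest = best_epoch(HISTORY, "validation_loss", higher_is_better=False)
    print(f"validation_loss: lowest at epoch {lowest}")
    for row in HISTORY:
        print(f"epoch {row.epoch}: loss gap {loss_gap(row):+.6f}")
```

On the table's numbers, F1 peaks at epoch 4, ROC AUC at epoch 6, accuracy at epoch 8, and validation loss is lowest at epoch 2, while the loss gap widens at every epoch. The card does not state which criterion led to publishing the epoch 6 checkpoint.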