Model Card for Product Return Prediction

model details

  • person or organization developing model: team product-return-prediction
  • model date: 24/11/2024
  • model version: v1.4
  • model type: Support Vector Machine

This model is a Support Vector Machine classifier that predicts whether a product will be returned, based on product and transaction features. The hyperparameters (C, kernel type, and gamma) are chosen by grid search with 10-fold cross-validation.
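A minimal sketch of this kind of tuning setup with scikit-learn is shown here. The candidate values in `param_grid`, the `f1` scoring metric, and the synthetic data are assumptions made for illustration; the card names only the tuned hyperparameters and the number of folds.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (assumption); the real features come from the sales dataset.
X, y = make_classification(n_samples=200, n_features=8, random_state=42)

# Hypothetical candidate values: the card lists C, kernel and gamma but not their ranges.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", "auto"],
}

# cv=10 gives 10-fold cross-validation (stratified by default for classifiers).
# scoring="f1" is an assumption; the card does not state the selection metric.
search = GridSearchCV(SVC(), param_grid, cv=10, scoring="f1")
search.fit(X, y)

best_params = search.best_params_
print(best_params, round(search.best_score_, 3))
```

By default `GridSearchCV` refits the best configuration on all the data passed to `fit`, so `search.best_estimator_` can be used directly for prediction afterwards.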

intended use

primary intended uses

The model is intended to help e-commerce owners (Armani) identify purchases that are likely to be returned, so that inventories can be reorganized to optimize product handling and transportation costs.

primary intended users

The model was developed for Armani, specifically to support professionals working in logistics, product management, and marketing.

factors

relevant factors

The following factors may affect the model's behaviour:

  • product features: characteristics like model, fabric, colour, composition, and product category may have a significant impact on the likelihood of a product being returned
  • imbalanced classes: the class imbalance is a relevant factor that may affect the model's ability to predict the minority class (returns) accurately

decision thresholds

The default decision threshold for the SVM model is 0.5, where probabilities greater than or equal to 0.5 indicate a "returned" prediction, and probabilities below 0.5 indicate "not returned."
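The thresholding rule can be sketched as follows. The helper name `apply_threshold` and the example probabilities are hypothetical; with scikit-learn's `SVC`, probability estimates are only available when the model is created with `probability=True`.

```python
import numpy as np


def apply_threshold(return_probabilities, threshold=0.5):
    """Map probabilities of the 'returned' class to labels (1 = returned, 0 = not returned)."""
    probs = np.asarray(return_probabilities, dtype=float)
    # ">=" follows the card: a probability exactly at the threshold counts as a return.
    return (probs >= threshold).astype(int)


labels = apply_threshold([0.12, 0.50, 0.73, 0.49])
print(labels.tolist())  # [0, 1, 1, 0]
```

Lowering the threshold flags more purchases as returns, which generally raises recall on the minority class at the cost of precision.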

Train and Test data

dataset description

  • dataset: German Sales 2023 EA

The model was trained and tested on this dataset after the splitting and pre-processing steps described below.

split

Dataset splitting is as follows:

  • training: 80%
  • validation and test: 20%

The split is performed with the corresponding scikit-learn function, using a random state of 42.
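An 80/20 split with a fixed random state might look like the sketch below. It assumes the function in question is `train_test_split`; the toy arrays and the `stratify=y` argument are also assumptions, and how the 20% is divided between validation and test is not specified in the card.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (assumption): 100 samples, 30% of them labelled as returns.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 70 + [1] * 30)

# test_size=0.2 holds out 20%; random_state=42 makes the split reproducible.
# stratify=y keeps the class ratio equal in both parts (assumption, useful with imbalanced classes).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```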

pre-processing

To suit the binary classification task and a numerical model such as an SVM, the dataset underwent an extensive pre-processing phase. The pre-processing steps are the following:

  1. Dataset conversion from Excel to TSV
  2. Removal of specific columns from the dataframe
  3. Train and test data splitting
  4. Train and save scaler
  5. Scaling data with a pre-trained scaler
  6. Target encoding of categorical columns
  7. Preparation of inventory with sales data
  8. Population of missing values
  9. Calculation and application of return percentages by color
  10. Final cleaning and processing
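Steps 4, 5 and 9 can be illustrated with a small sketch. The column names (`price`, `color`, `returned`), the choice of `StandardScaler`, the use of `joblib` for persistence, and the fallback for unseen colors are all assumptions; the card does not describe the actual schema or scaler. The return share is computed here as a fraction between 0 and 1, from training rows only, so that no information from the held-out data leaks into the feature.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frames with hypothetical column names.
train = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "color": ["red", "red", "blue", "blue"],
    "returned": [1, 0, 0, 0],
})
test = pd.DataFrame({"price": [25.0, 35.0], "color": ["red", "green"]})

# Steps 4-5: fit the scaler on training data only, save it, then reload and apply it.
scaler = StandardScaler().fit(train[["price"]])
scaler_path = os.path.join(tempfile.mkdtemp(), "scaler.joblib")
joblib.dump(scaler, scaler_path)
loaded = joblib.load(scaler_path)
train["price_scaled"] = loaded.transform(train[["price"]])[:, 0]
test["price_scaled"] = loaded.transform(test[["price"]])[:, 0]

# Step 9: share of returned items per color, estimated on training rows.
color_rate = train.groupby("color")["returned"].mean()
overall_rate = train["returned"].mean()
train["color_return_rate"] = train["color"].map(color_rate)
# Colors never seen in training fall back to the overall training return rate.
test["color_return_rate"] = test["color"].map(color_rate).fillna(overall_rate)

print(test)
```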

Quantitative analysis

Class        Precision   Recall   F1-score   Support
No return    0.95        0.95     0.95       2086
Return       0.89        0.90     0.89       960

Accuracy: 0.93
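As a quick consistency check, overall accuracy equals the support-weighted average of the per-class recalls. The sketch below recomputes it from the rounded figures reported above, so the result is approximate; the dictionary layout is only illustrative.

```python
# Per-class recall and support, copied from the reported results.
report = {
    "No return": {"recall": 0.95, "support": 2086},
    "Return": {"recall": 0.90, "support": 960},
}

total = sum(row["support"] for row in report.values())
# recall * support approximates the number of correctly classified samples per class.
correct = sum(row["recall"] * row["support"] for row in report.values())
accuracy = correct / total
# Fraction of evaluated samples that are returns (the minority class).
return_share = report["Return"]["support"] / total

print(total, round(accuracy, 2), round(return_share, 3))  # 3046 0.93 0.315
```

The recomputed value agrees with the reported accuracy of 0.93, and returns make up roughly a third of the evaluated samples.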