metadata

license: mit

Enrollment Prediction Machine Learning Model

This repository contains a machine learning model for predicting student enrollment, trained on a public dataset from Kaggle. The dataset contains features related to student demographics, academic performance, and economic factors.

Dataset

The dataset consists of 34 columns and 4,882 rows. Each row represents a student and contains the following features: Marital status, Application mode, Application order, Course, Daytime/evening attendance, Previous qualification, Nacionality, Mother's qualification, Father's qualification, Mother's occupation, Father's occupation, Displaced, Educational special needs, Debtor, Tuition fees up to date, Gender, Scholarship holder, Age at enrollment, International, Curricular units 1st sem (credited), Curricular units 1st sem (enrolled), Curricular units 1st sem (evaluations), Curricular units 1st sem (approved), Curricular units 1st sem (grade), Curricular units 1st sem (without evaluations), Curricular units 2nd sem (credited), Curricular units 2nd sem (enrolled), Curricular units 2nd sem (evaluations), Curricular units 2nd sem (approved), Curricular units 2nd sem (grade), Curricular units 2nd sem (without evaluations), Unemployment rate, Inflation rate, and GDP.

The target column is "Target", which indicates whether a student dropped out or graduated.

The dataset can be found on Kaggle: https://www.kaggle.com/datasets/thedevastator/higher-education-predictors-of-student-retention
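As a minimal sketch of how the feature columns might be separated from the "Target" column before training, assuming the CSV has already been loaded into a pandas DataFrame. The helper name `split_features_target`, the sample values, and the label strings "Graduate" and "Dropout" are illustrative assumptions, not taken from the notebook.

```python
import pandas as pd


def split_features_target(df: pd.DataFrame, target_col: str = "Target"):
    """Return (X, y): every column except the target, and the target itself."""
    if target_col not in df.columns:
        raise KeyError(f"expected a '{target_col}' column in the dataset")
    X = df.drop(columns=[target_col])
    y = df[target_col]
    return X, y


# Tiny stand-in for the Kaggle CSV; the values are made up for illustration.
sample = pd.DataFrame(
    {
        "Age at enrollment": [19, 24, 31],
        "Debtor": [0, 1, 0],
        "Scholarship holder": [1, 0, 0],
        "Target": ["Graduate", "Dropout", "Graduate"],
    }
)

X, y = split_features_target(sample)
print(list(X.columns))
print(y.value_counts().to_dict())
```

With the real file, `sample` would be replaced by the DataFrame returned from `pd.read_csv` on the downloaded CSV.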

Model

The model uses a decision tree algorithm to predict student enrollment. It was trained on 80% of the data, with the remaining 20% held out for testing. The reported accuracy of the model is 85%.
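The 80/20 split and decision tree training described above could look roughly like the following sketch. It runs on synthetic data from `make_classification` so that it is self-contained. The tree depth, the stratified split, and the `random_state` values are assumptions for illustration; the notebook's actual settings are not stated here, and the accuracy this sketch prints is unrelated to the figure reported above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 34 numeric features and a binary label.
X, y = make_classification(
    n_samples=1000, n_features=34, n_informative=10, random_state=0
)

# Hold out 20% of the rows for testing; stratify keeps class ratios similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# max_depth=5 is an assumed hyperparameter, chosen only to limit overfitting.
clf = DecisionTreeClassifier(max_depth=5, random_state=0)
clf.fit(X_train, y_train)

# Accuracy is measured on the held-out rows only.
test_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")
```

Fixing `random_state` makes the split and the fitted tree reproducible, which helps when comparing accuracy between runs.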

Files

This repository contains the following files:

enrollment_prediction_model.ipynb: Jupyter notebook containing the code for training and testing the model

enrollment_prediction_model.pkl: Serialized machine learning model file

enrollment_prediction_model_readme.md: Readme file containing information about the machine learning model

Usage

To use the machine learning model, follow these steps:

1. Clone the repository.

2. Install the required packages (pandas, numpy, scikit-learn).

3. Load the serialized machine learning model from the enrollment_prediction_model.pkl file.

4. Prepare a new dataset with the same feature columns as the original dataset (without the "Target" column).

5. Use the predict function of the model to predict enrollment for each row in the new dataset.

Example code:

import pickle

import pandas as pd

# Load the serialized machine learning model
# (only unpickle files that come from a trusted source)
with open('enrollment_prediction_model.pkl', 'rb') as file:
    model = pickle.load(file)

# Prepare the new dataset
new_data = pd.read_csv('new_data.csv')

# Predict enrollment for each row
predictions = model.predict(new_data)
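Because the model expects the same columns it was trained on, it may help to check and reorder the new data before calling predict. The sketch below shows one way to do that. The helper `align_features` and the three-column `expected` list are hypothetical and only illustrate the idea.

```python
from typing import List

import pandas as pd


def align_features(new_data: pd.DataFrame, expected_columns: List[str]) -> pd.DataFrame:
    """Return new_data restricted to expected_columns, in that order.

    Raises ValueError if any expected column is missing.
    """
    missing = [col for col in expected_columns if col not in new_data.columns]
    if missing:
        raise ValueError(f"new data is missing columns: {missing}")
    return new_data[list(expected_columns)]


# Hypothetical example: three training features, and new data that carries
# an extra "Target" column and uses a different column order.
expected = ["Debtor", "Gender", "Age at enrollment"]
new_data = pd.DataFrame(
    {
        "Age at enrollment": [20, 35],
        "Target": ["Graduate", "Dropout"],
        "Gender": [1, 0],
        "Debtor": [0, 1],
    }
)

features = align_features(new_data, expected)
print(list(features.columns))
```

If the model was fitted on a DataFrame with scikit-learn 1.0 or later, its `feature_names_in_` attribute may supply the expected column list; otherwise the list has to come from the training notebook.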