---
license: mit
---

Enrollment Prediction Machine Learning Model

This repository contains a machine learning model that predicts student enrollment from a public dataset obtained from Kaggle. The dataset contains features related to student demographics, academic performance, and economic factors.

Dataset

The dataset consists of 34 feature columns and 4,882 rows. Each row represents a student. The features are Marital status, Application mode, Application order, Course, Daytime/evening attendance, Previous qualification, Nacionality (spelled this way in the dataset), Mother's qualification, Father's qualification, Mother's occupation, Father's occupation, Displaced, Educational special needs, Debtor, Tuition fees up to date, Gender, Scholarship holder, Age at enrollment, International, Curricular units 1st sem (credited), Curricular units 1st sem (enrolled), Curricular units 1st sem (evaluations), Curricular units 1st sem (approved), Curricular units 1st sem (grade), Curricular units 1st sem (without evaluations), Curricular units 2nd sem (credited), Curricular units 2nd sem (enrolled), Curricular units 2nd sem (evaluations), Curricular units 2nd sem (approved), Curricular units 2nd sem (grade), Curricular units 2nd sem (without evaluations), Unemployment rate, Inflation rate, and GDP.

The target column is "Target", which indicates whether a student dropped out or graduated.

The dataset can be found on Kaggle: https://www.kaggle.com/datasets/thedevastator/higher-education-predictors-of-student-retention
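As a rough illustration of how the columns described above separate into model inputs and the label, the sketch below splits a DataFrame on the "Target" column. The helper name split_features_target and the three-row sample frame are made up for this example; only the column names come from the dataset description.

```python
import pandas as pd


def split_features_target(df: pd.DataFrame, target_column: str = "Target"):
    """Return (X, y): every column except the target, and the target itself."""
    if target_column not in df.columns:
        raise KeyError(f"missing target column: {target_column!r}")
    X = df.drop(columns=[target_column])
    y = df[target_column]
    return X, y


# Tiny made-up frame that reuses three of the real column names.
sample = pd.DataFrame(
    {
        "Age at enrollment": [19, 24, 31],
        "Debtor": [0, 1, 0],
        "GDP": [1.74, -3.12, 0.79],
        "Target": ["Graduate", "Dropout", "Graduate"],
    }
)

X, y = split_features_target(sample)
print(list(X.columns))
print(y.tolist())
```

With the real CSV, the same helper could be applied to the frame returned by pd.read_csv.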

Model

The machine learning model uses a decision tree algorithm to predict student enrollment. It was trained on 80% of the data, with the remaining 20% held out for testing. The reported accuracy of the model is 85%.
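The training setup described above might look roughly like the following sketch with scikit-learn. It runs on synthetic data so that it is self-contained; the random seeds, the max_depth value, and the synthetic features are illustrative assumptions, not settings taken from the notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 200 rows, 5 numeric features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out 20% of the rows for testing and train on the other 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_depth and random_state are illustrative choices, not the notebook's.
model = DecisionTreeClassifier(max_depth=5, random_state=42)
model.fit(X_train, y_train)

# Accuracy is measured on the held-out rows only.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"train rows: {len(X_train)}, test rows: {len(X_test)}")
print(f"test accuracy: {accuracy:.2f}")
```

Fixing random_state makes the split and the fitted tree reproducible, which helps when comparing accuracy figures between runs.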

Files

This repository contains the following files:

enrollment_prediction_model.ipynb: Jupyter notebook containing the code for training and testing the model

enrollment_prediction_model.pkl: Serialized machine learning model file

enrollment_prediction_model_readme.md: Readme file containing information about the machine learning model
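For context on the .pkl file, here is a minimal sketch of a pickle round trip for a scikit-learn model. It assumes the file was written with pickle.dump, which matches the pickle.load call used to read it but is not confirmed by the notebook; the toy model and the file name toy_model.pkl are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.tree import DecisionTreeClassifier

# Toy model fitted on made-up data, standing in for the trained model.
toy_model = DecisionTreeClassifier(random_state=0)
toy_model.fit([[0], [1], [2], [3]], [0, 0, 1, 1])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "toy_model.pkl"  # hypothetical file name

    # Serialize the fitted model to disk.
    with open(path, "wb") as file:
        pickle.dump(toy_model, file)

    # Read it back; only unpickle files from a source you trust.
    with open(path, "rb") as file:
        restored = pickle.load(file)

print(restored.predict([[0], [3]]).tolist())
```

Pickled scikit-learn models are generally tied to the library version that produced them, so loading works most reliably with the same scikit-learn version used for training.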

Usage

To use the machine learning model, follow these steps:

1. Clone the repository.
2. Install the required packages (pandas, numpy, scikit-learn).
3. Load the serialized model from the enrollment_prediction_model.pkl file.
4. Prepare a new dataset with the same feature columns as the original dataset.
5. Call the model's predict method to predict enrollment for each row in the new dataset.

Example code:

```python
import pickle

import pandas as pd

# Load the serialized model. Only unpickle files from a source you trust.
with open('enrollment_prediction_model.pkl', 'rb') as file:
    model = pickle.load(file)

# Load new data with the same feature columns used for training
new_data = pd.read_csv('new_data.csv')

# Predict enrollment for each row
predictions = model.predict(new_data)
```
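Because the new dataset has to carry the same columns as the training data, it can help to check and reorder the columns before calling predict. The sketch below is one possible way to do that; align_columns and the three-column expected list are hypothetical, chosen only to keep the example short.

```python
import pandas as pd


def align_columns(new_data: pd.DataFrame, expected_columns: list[str]) -> pd.DataFrame:
    """Return new_data restricted to expected_columns, in that order."""
    missing = [col for col in expected_columns if col not in new_data.columns]
    if missing:
        raise ValueError(f"new data is missing columns: {missing}")
    # Selecting by list drops extra columns and fixes the column order.
    return new_data[expected_columns]


# Shortened, illustrative column list; the real model expects all features.
expected = ["Age at enrollment", "Debtor", "GDP"]

# Made-up row with shuffled columns and an extra "Target" column.
new_data = pd.DataFrame(
    {"GDP": [1.74], "Target": ["Graduate"], "Debtor": [0], "Age at enrollment": [20]}
)

aligned = align_columns(new_data, expected)
print(list(aligned.columns))
```

If the model was fitted on a pandas DataFrame with a recent scikit-learn release, the expected list may be available as model.feature_names_in_; whether that holds for this particular .pkl file is an assumption to verify.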