---
library_name: sklearn
tags:
- sklearn
- skops
- tabular-regression
model_format: skops
model_file: model.skops
widget:
  structuredData:
    AveBedrms:
    - 0.9290780141843972
    - 0.9458483754512635
    - 1.087360594795539
    AveOccup:
    - 3.1134751773049647
    - 3.0613718411552346
    - 3.2657992565055762
    AveRooms:
    - 6.304964539007092
    - 6.945848375451264
    - 3.8884758364312266
    HouseAge:
    - 17.0
    - 15.0
    - 24.0
    Latitude:
    - 34.23
    - 36.84
    - 34.04
    Longitude:
    - -117.41
    - -119.77
    - -118.3
    MedInc:
    - 6.1426
    - 5.3886
    - 1.7109
    Population:
    - 439.0
    - 848.0
    - 1757.0
---
# Model description
Gradient boosting regressor trained on the California Housing dataset.

The model is a gradient boosting regressor from sklearn. In addition to the standard
features, it uses the predictions of a KNN model fitted on longitude and latitude. These
predictions are calculated out of fold and then appended to the existing features. Such
features help decision tree-based models, which cannot easily learn from geospatial
data.
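
The out-of-fold step described above can be illustrated by hand. The following is a minimal sketch, not the training script for this model: it uses a small synthetic table instead of the real dataset so that it runs offline, and the column name `knn_pred` and the 5-fold split are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the housing data (assumption: not the real dataset).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "MedInc": rng.uniform(0.5, 10.0, n),
    "Latitude": rng.uniform(32.5, 42.0, n),
    "Longitude": rng.uniform(-124.3, -114.3, n),
})
# The target depends on location, so nearby rows have similar values.
y = 0.3 * X["MedInc"] + np.sin(X["Latitude"]) + np.cos(X["Longitude"])

# Out-of-fold KNN predictions: every row is predicted by a KNN model that
# was fitted on the other folds, so the row's own target never leaks in.
knn = KNeighborsRegressor(n_neighbors=5)
oof = cross_val_predict(knn, X[["Longitude", "Latitude"]], y, cv=5)

# Append the predictions as one extra column next to the original features.
X_aug = X.assign(knn_pred=oof)
print(X_aug.shape)
```

A tree-based model trained on `X_aug` then sees a single feature that already summarizes the target values of geographically close rows, instead of having to approximate that neighbourhood structure with axis-aligned splits on latitude and longitude.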
## Intended uses & limitations
This model is meant for demonstration purposes.
## Training Procedure
### Hyperparameters
The model was trained with the hyperparameters below.
| Hyperparameter | Value |
|-----------------------------------------------|--------------------------------------------------------------|
| cv | |
| estimators | [('knn@5', Pipeline(steps=[('select_cols', ColumnTransformer(transformers=[('long_and_lat', 'passthrough', ['Longitude', 'Latitude'])])), ('knn', KNeighborsRegressor())]))] |
| final_estimator__alpha | 0.9 |
| final_estimator__ccp_alpha | 0.0 |
| final_estimator__criterion | friedman_mse |
| final_estimator__init | |
| final_estimator__learning_rate | 0.1 |
| final_estimator__loss | squared_error |
| final_estimator__max_depth | 3 |
| final_estimator__max_features | |
| final_estimator__max_leaf_nodes | |
| final_estimator__min_impurity_decrease | 0.0 |
| final_estimator__min_samples_leaf | 1 |
| final_estimator__min_samples_split | 2 |
| final_estimator__min_weight_fraction_leaf | 0.0 |
| final_estimator__n_estimators | 500 |
| final_estimator__n_iter_no_change | |
| final_estimator__random_state | 0 |
| final_estimator__subsample | 1.0 |
| final_estimator__tol | 0.0001 |
| final_estimator__validation_fraction | 0.1 |
| final_estimator__verbose | 0 |
| final_estimator__warm_start | False |
| final_estimator | GradientBoostingRegressor(n_estimators=500, random_state=0) |
| n_jobs | |
| passthrough | True |
| verbose | 0 |
| knn@5 | Pipeline(steps=[('select_cols', ColumnTransformer(transformers=[('long_and_lat', 'passthrough', ['Longitude', 'Latitude'])])), ('knn', KNeighborsRegressor())]) |
| knn@5__memory | |
| knn@5__steps | [('select_cols', ColumnTransformer(transformers=[('long_and_lat', 'passthrough', ['Longitude', 'Latitude'])])), ('knn', KNeighborsRegressor())] |
| knn@5__verbose | False |
| knn@5__select_cols | ColumnTransformer(transformers=[('long_and_lat', 'passthrough', ['Longitude', 'Latitude'])]) |
| knn@5__knn | KNeighborsRegressor() |
| knn@5__select_cols__n_jobs | |
| knn@5__select_cols__remainder | drop |
| knn@5__select_cols__sparse_threshold | 0.3 |
| knn@5__select_cols__transformer_weights | |
| knn@5__select_cols__transformers | [('long_and_lat', 'passthrough', ['Longitude', 'Latitude'])] |
| knn@5__select_cols__verbose | False |
| knn@5__select_cols__verbose_feature_names_out | True |
| knn@5__select_cols__long_and_lat | passthrough |
| knn@5__knn__algorithm | auto |
| knn@5__knn__leaf_size | 30 |
| knn@5__knn__metric | minkowski |
| knn@5__knn__metric_params | |
| knn@5__knn__n_jobs | |
| knn@5__knn__n_neighbors | 5 |
| knn@5__knn__p | 2 |
| knn@5__knn__weights | uniform |

Model structure: `StackingRegressor(estimators=[('knn@5', Pipeline(steps=[('select_cols', ColumnTransformer(transformers=[('long_and_lat', 'passthrough', ['Longitude', 'Latitude'])])), ('knn', KNeighborsRegressor())]))], final_estimator=GradientBoostingRegressor(n_estimators=500, random_state=0), passthrough=True)`
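
The configuration listed in the hyperparameter table can be rebuilt with scikit-learn as sketched below. The estimator definition follows the values in the table; the data is a small random table that merely reuses the eight feature names, an assumption made so the sketch runs without downloading the dataset. It is not the data the published model was trained on.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline

# KNN branch: keep only the coordinates, then fit a 5-neighbour regressor
# (n_neighbors=5 is the KNeighborsRegressor default, as listed in the table).
knn_pipeline = Pipeline(steps=[
    ("select_cols", ColumnTransformer(transformers=[
        ("long_and_lat", "passthrough", ["Longitude", "Latitude"]),
    ])),
    ("knn", KNeighborsRegressor()),
])

# passthrough=True hands the original features to the final estimator
# together with the out-of-fold KNN predictions.
model = StackingRegressor(
    estimators=[("knn@5", knn_pipeline)],
    final_estimator=GradientBoostingRegressor(n_estimators=500, random_state=0),
    passthrough=True,
)

# get_params() yields the flat name/value pairs shown in the table.
params = model.get_params()
print(params["final_estimator__n_estimators"], params["knn@5__knn__n_neighbors"])

# Random placeholder data with the model's feature names (assumption).
columns = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
           "Population", "AveOccup", "Latitude", "Longitude"]
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(size=(200, len(columns))), columns=columns)
y = X["MedInc"] + np.sin(6 * X["Latitude"]) * np.cos(6 * X["Longitude"])

model.fit(X, y)

# The final estimator is trained on 8 original features + 1 KNN prediction.
n_meta_features = model.final_estimator_.n_features_in_
print(n_meta_features)
```

Because `cv` is left unset, `StackingRegressor` falls back to its default 5-fold cross-validation when generating the KNN predictions that the final estimator is trained on.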