verbatims-unvalid / README.md
metadata
library_name: sklearn
license: mit
tags:
  - sklearn
  - skops
  - text-classification
model_format: pickle
model_file: skops-zquiq5g5.pkl

Model description

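This is a Support Vector Classifier (SVC) trained on the SIRIUS dataset. As input, the model takes text embeddings encoded with camembert-base (768 dimensions). The fitted pipeline selects 2304 feature columns derived from these embeddings, named avg_1 through max_768 in the hyperparameters below.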
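The card does not document how the embedding columns were computed. The avg_ and max_ prefixes suggest average and max pooling over token vectors, so the following is only a minimal sketch under that assumption. The function name `pool_token_embeddings` is hypothetical, and the remaining column group (the index is elided in the card) is not reconstructed here.

```python
from statistics import fmean


def pool_token_embeddings(token_vectors):
    """Collapse per-token vectors into one flat feature dict.

    token_vectors: non-empty list of equal-length float sequences, one per token.
    Returns a dict keyed avg_1..avg_D and max_1..max_D (1-based, as in the card).
    Assumption: avg_/max_ mean average and max pooling; the card does not say so.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    features = {}
    # zip(*...) iterates over embedding dimensions instead of tokens
    for i, column in enumerate(zip(*token_vectors), start=1):
        features[f"avg_{i}"] = fmean(column)
        features[f"max_{i}"] = max(column)
    return features


# Toy 3-dimensional vectors standing in for real 768-dimensional ones
toy = pool_token_embeddings([[1.0, 2.0, 3.0], [3.0, 0.0, 3.0]])
print(toy["avg_1"], toy["max_2"])
```

With real camembert-base output, each token vector would have 768 entries, giving 768 avg_ and 768 max_ columns.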

Intended uses & limitations

This model is not ready to be used in production.

Training Procedure

[More Information Needed]

Hyperparameters

Hyperparameter Value
memory
steps [('columntransformer', ColumnTransformer(transformers=[('num', Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler()), ('pca', PCA(n_components=84))]), Index(['avg_1', 'avg_2', 'avg_3', 'avg_4', 'avg_5', 'avg_6', 'avg_7', 'avg_8', 'avg_9', 'avg_10', ... 'max_759', 'max_760', 'max_761', 'max_762', 'max_763', 'max_764', 'max_765', 'max_766', 'max_767', 'max_768'], dtype='object', length=2304))], verbose_feature_names_out=False)), ('svc', SVC(probability=True, random_state=42))]
verbose False
columntransformer ColumnTransformer(transformers=[('num', Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler()), ('pca', PCA(n_components=84))]), Index(['avg_1', 'avg_2', 'avg_3', 'avg_4', 'avg_5', 'avg_6', 'avg_7', 'avg_8', 'avg_9', 'avg_10', ... 'max_759', 'max_760', 'max_761', 'max_762', 'max_763', 'max_764', 'max_765', 'max_766', 'max_767', 'max_768'], dtype='object', length=2304))], verbose_feature_names_out=False)
svc SVC(probability=True, random_state=42)
columntransformer__force_int_remainder_cols True
columntransformer__n_jobs
columntransformer__remainder drop
columntransformer__sparse_threshold 0.3
columntransformer__transformer_weights
columntransformer__transformers [('num', Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler()), ('pca', PCA(n_components=84))]), Index(['avg_1', 'avg_2', 'avg_3', 'avg_4', 'avg_5', 'avg_6', 'avg_7', 'avg_8', 'avg_9', 'avg_10', ... 'max_759', 'max_760', 'max_761', 'max_762', 'max_763', 'max_764', 'max_765', 'max_766', 'max_767', 'max_768'], dtype='object', length=2304))]
columntransformer__verbose False
columntransformer__verbose_feature_names_out False
columntransformer__num Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler()), ('pca', PCA(n_components=84))])
columntransformer__num__memory
columntransformer__num__steps [('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler()), ('pca', PCA(n_components=84))]
columntransformer__num__verbose False
columntransformer__num__imputer SimpleImputer(strategy='median')
columntransformer__num__scaler StandardScaler()
columntransformer__num__pca PCA(n_components=84)
columntransformer__num__imputer__add_indicator False
columntransformer__num__imputer__copy True
columntransformer__num__imputer__fill_value
columntransformer__num__imputer__keep_empty_features False
columntransformer__num__imputer__missing_values nan
columntransformer__num__imputer__strategy median
columntransformer__num__scaler__copy True
columntransformer__num__scaler__with_mean True
columntransformer__num__scaler__with_std True
columntransformer__num__pca__copy True
columntransformer__num__pca__iterated_power auto
columntransformer__num__pca__n_components 84
columntransformer__num__pca__n_oversamples 10
columntransformer__num__pca__power_iteration_normalizer auto
columntransformer__num__pca__random_state
columntransformer__num__pca__svd_solver auto
columntransformer__num__pca__tol 0.0
columntransformer__num__pca__whiten False
svc__C 1.0
svc__break_ties False
svc__cache_size 200
svc__class_weight
svc__coef0 0.0
svc__decision_function_shape ovr
svc__degree 3
svc__gamma scale
svc__kernel rbf
svc__max_iter -1
svc__probability True
svc__random_state 42
svc__shrinking True
svc__tol 0.001
svc__verbose False

Model Plot

Pipeline(steps=[('columntransformer', ColumnTransformer(transformers=[('num', Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler()), ('pca', PCA(n_components=84))]), Index(['avg_1', 'avg_2', 'avg_3', 'avg_4', 'avg_5', 'avg_6', 'avg_7', 'avg_8', 'avg_9', 'avg_10', ... 'max_759', 'max_760', 'max_761', 'max_762', 'max_763', 'max_764', 'max_765', 'max_766', 'max_767', 'max_768'], dtype='object', length=2304))], verbose_feature_names_out=False)), ('svc', SVC(probability=True, random_state=42))])

Evaluation Results

Metric Value
accuracy 0.935065
f1 score 0.935709
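
The card does not say how these metrics were computed. As a hedged sketch, both can be obtained with `sklearn.metrics`. The helper name `score_predictions` is hypothetical, and the `"weighted"` F1 averaging is an assumption, since the card does not state the averaging mode or the test split.

```python
from sklearn.metrics import accuracy_score, f1_score


def score_predictions(y_true, y_pred, average="weighted"):
    """Return accuracy and F1 for a set of predictions.

    average: F1 averaging mode. "weighted" is an assumption; the card
    does not say which mode produced the reported value.
    """
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1 score": f1_score(y_true, y_pred, average=average),
    }


# Toy labels for illustration, not taken from the SIRIUS dataset
scores = score_predictions([0, 1, 1, 0], [0, 1, 0, 0])
print(scores)
```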

Confusion Matrix

[Figure: confusion matrix]

How to Get Started with the Model

[More Information Needed]

Model Card Authors

huynhdoo

Model Card Contact

You can contact the model card authors through the following channels: [More Information Needed]

Citation

BibTeX

@inproceedings{...,year={2024}}

get_started_code

    import pickle

    pkl_filename = "skops-zquiq5g5.pkl"  # model_file from the metadata
    with open(pkl_filename, "rb") as file:
        pipe = pickle.load(file)  # only unpickle files you trust
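
Once loaded, the pipeline can be queried like any scikit-learn classifier. The following is a minimal sketch, assuming `features` is a pandas DataFrame with the column names the pipeline was fitted on (the ColumnTransformer selects columns by name). The helper name `classify` is hypothetical.

```python
def classify(pipe, features):
    """Return predicted labels and per-class probabilities for each row.

    pipe: a fitted classifier exposing predict, predict_proba and classes_.
    features: rows to classify (for this model, a DataFrame of embedding columns).
    """
    labels = list(pipe.predict(features))
    # predict_proba is available because the SVC was trained with probability=True
    probabilities = pipe.predict_proba(features)
    # Columns of predict_proba follow the order of pipe.classes_
    scored = [dict(zip(pipe.classes_, row)) for row in probabilities]
    return labels, scored
```

Note that with `SVC(probability=True)` the probabilities come from a separate calibration step, so the most probable class can occasionally differ from the label returned by `predict`.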