About

This model was created to support experiments evaluating phonetic transcription with the Buckeye corpus, as part of https://github.com/ginic/multipa/tree/buckeye_experiments. It is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus. For details about model parameters, see the model's config.json or the training scripts in scripts/buckeye_experiments on the buckeye_experiments branch of the GitHub repository.
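As a rough sketch of how a checkpoint like this might be loaded for inference, the snippet below uses the generic `pipeline` helper from the Hugging Face `transformers` library. It assumes `transformers` and an audio backend are installed and that the checkpoint works with the standard automatic-speech-recognition pipeline. The `transcribe` helper and the `example.wav` path are hypothetical names used only for illustration; they are not part of the repository's scripts.

```python
# Model ID as shown on the model page.
MODEL_ID = "ginic/data_seed_4_wav2vec2-large-xlsr-buckeye-ipa"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the predicted transcription string for one audio file.

    Hypothetical helper: the import is kept inside the function so this
    module can be loaded even where transformers is not installed.
    """
    from transformers import pipeline

    # Generic CTC speech recognition pipeline; downloads the checkpoint on first use.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling `transcribe("example.wav")` would then return the model's output for that file, assuming the file exists and can be decoded.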

Experiment Details

This experiment varies the random seed used to select training data while keeping an even 50/50 gender split. Each model is retrained with the same model parameters but a different data seed, to measure whether the choice of training data has a statistically significant effect on performance.

Goals:

Establish whether varying the training data, while keeping the same gender makeup, produces a statistically significant change in performance on the test set.

Params to vary:

training data seed (--train_seed): [7 (default), 91, 15, 139, 503]
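The idea of seeded, gender-balanced data selection can be illustrated with a minimal sketch. This is not the repository's selection code: the `sample_balanced` function, the `(record_id, gender)` record layout, and the toy records are all assumptions made for illustration. Only the seed values come from the list above.

```python
import random
from collections import Counter

# Seeds listed above for --train_seed.
TRAIN_SEEDS = [7, 91, 15, 139, 503]


def sample_balanced(records, n_per_gender, seed):
    """Pick n_per_gender records of each gender, reproducibly for a given seed.

    records: list of (record_id, gender) pairs (hypothetical layout).
    """
    rng = random.Random(seed)  # local generator, so global random state is untouched
    by_gender = {}
    for record_id, gender in records:
        by_gender.setdefault(gender, []).append(record_id)

    selected = []
    for gender in sorted(by_gender):  # fixed iteration order keeps the draw deterministic
        pool = sorted(by_gender[gender])
        selected.extend((rid, gender) for rid in rng.sample(pool, n_per_gender))
    return selected


# Toy, made-up records: 10 per gender. This is not Buckeye data.
toy_records = [(f"f{i:02d}", "female") for i in range(10)]
toy_records += [(f"m{i:02d}", "male") for i in range(10)]

for seed in TRAIN_SEEDS:
    subset = sample_balanced(toy_records, n_per_gender=4, seed=seed)
    counts = Counter(gender for _, gender in subset)
    print(seed, dict(counts), sorted(rid for rid, _ in subset))
```

Sampling each gender separately with a per-call `random.Random(seed)` keeps the split at exactly 50/50 for every seed and makes each selection repeatable, so any difference between runs can be attributed to which records were chosen.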

Model ID: ginic/data_seed_4_wav2vec2-large-xlsr-buckeye-ipa
