About
This model was created to support experiments for evaluating phonetic transcription
with the Buckeye corpus as part of https://github.com/ginic/multipa/tree/buckeye_experiments.
This is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus.
For details about specific model parameters, see this model's config.json or the
training scripts in scripts/buckeye_experiments on the buckeye_experiments
branch of the GitHub repository.
Experiment Details
Retrain with the same model parameters while varying the random seed used to select the training data, keeping an even 50/50 gender split in every run. This measures whether the choice of training data alone has a statistically significant effect on performance.
Goals:
- Establish whether varying the training data, with the gender makeup held constant, produces a statistically significant change in performance on the test set
Params to vary:
- training data seed (--train_seed): [7 (default), 91, 15, 139, 503]
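
As an illustration of the setup described above, the following is a minimal sketch of seed-controlled data selection with an even gender split. It is not the repository's actual selection code: the `select_balanced` helper, the speaker IDs, and the choice to sample by speaker are all assumptions made for the example. Only the seed values come from the list above.

```python
import random
from collections import Counter

# Seeds listed above; 7 is the default.
TRAIN_SEEDS = [7, 91, 15, 139, 503]


def select_balanced(speakers, n_per_gender, seed):
    """Pick n_per_gender speakers of each gender, reproducibly for a given seed.

    Hypothetical helper: `speakers` maps a speaker ID to "f" or "m".
    """
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    selected = []
    for gender in ("f", "m"):
        # Sort the pool so the result depends only on the seed, not on dict order.
        pool = sorted(s for s, g in speakers.items() if g == gender)
        selected.extend(rng.sample(pool, n_per_gender))
    return sorted(selected)


# Made-up speaker IDs and genders, for illustration only.
speakers = {f"spk{i:02d}": ("f" if i % 2 == 0 else "m") for i in range(20)}

for seed in TRAIN_SEEDS:
    subset = select_balanced(speakers, n_per_gender=4, seed=seed)
    counts = Counter(speakers[s] for s in subset)
    print(seed, subset, dict(counts))
```

Each seed yields a reproducible subset with the same gender makeup, so differences between runs can be attributed to which data was selected rather than to a change in the gender balance.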