---
datasets:
- mozilla-foundation/common_voice_13_0
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---

# Model Card for Model ID

This model card aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

## Model Details

### Model Description

```python
import torch.nn as nn
from torch import Tensor
from transformers.models.whisper.modeling_whisper import WhisperEncoder


class WhisperCTC(nn.Module):
    """Frozen Whisper encoder followed by a small CTC classification head."""

    def __init__(
        self,
        encoder_id: str = "tuanio/whisper-encoder.tiny.en",
        dropout: float = 0.1,
        vocab_size: int = 47,
    ):
        super().__init__()
        self.encoder = WhisperEncoder.from_pretrained(encoder_id)
        # Only the head below is trained; the encoder weights stay fixed.
        print("Freezing Whisper Encoder...")
        self.encoder._freeze_parameters()
        print("Frozen.")
        # Project each encoder frame to vocabulary logits.
        self.lm_head = nn.Sequential(
            nn.SiLU(),
            nn.Dropout(dropout),
            nn.Linear(self.encoder.config.d_model, vocab_size),
        )
        nn.init.kaiming_uniform_(
            self.lm_head[-1].weight, mode="fan_in", nonlinearity="relu"
        )

    def forward(self, feat: Tensor, attn_mask: Tensor):
        enc = self.encoder(
            input_features=feat, attention_mask=attn_mask
        ).last_hidden_state
        logits = self.lm_head(enc)
        # Per-frame log-probabilities over the vocabulary, as expected by CTC loss.
        log_probs = nn.functional.log_softmax(logits, dim=-1)
        return log_probs
```

- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
More information needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training Data

- IndicTTS: https://www.kaggle.com/datasets/tuannguyenvananh/indictts-english

[More Information Needed]

### Training Procedure

#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

```yaml
data_cfg:
  dataset:
    processor:
      feat_extractor_id: ${model_cfg.model.encoder_id}
      tokenizer_id: ${model_cfg.tokenizer_id}
    path:
      base:
        indict_tts: ../IndicTTS
        cv: ../
      train:
        - train_data/indict_tts_train.jsonl
        # - train_data/cv_train.jsonl
      test:
        - train_data/indict_tts_test.jsonl
        # - train_data/cv_test.jsonl
      dev:
        - train_data/indict_tts_dev.jsonl
        # - train_data/cv_dev.jsonl
  dataloader:
    batch_size: 46
    num_workers: 8
    pin_memory: True

model_cfg:
  tokenizer_id: tuanio/wav2vec2-phoneme-ipa-ctc
  model:
    dropout: 0.1
    encoder_id: tuanio/whisper-encoder.medium.en
  optim:
    lr: 1.25e-05
    betas: [0.9, 0.998]
    weight_decay: 0.01
  scheduler:
    name: linear
    total_steps: -1
    warmup_ratio: 0.05
    interval: step
    frequency: 1

trainer_cfg:
  log:
    wandb: True
    logger_wandb:
      project: aped_indian-lish
      name: whisper-medium-indict-tts-only-from-epoch1
      log_model: all
  arguments:
    accelerator: gpu
    devices: -1
    max_epochs: 10
    log_every_n_steps: 1
    enable_checkpointing: True
    accumulate_grad_batches: 2
    inference_mode: True
    gradient_clip_val: 5.0
    check_val_every_n_epoch: 1
    val_check_interval: null

experiment_cfg:
  train: True
  valid: True
  test: True
  ckpt:
    resume_ckpt: True
    ckpt_path: ckpt/medium.epoch3.ckpt
```

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be
estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]
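
The `forward` method shown under Model Description returns per-frame log-probabilities, and a CTC-trained head like this one is usually read out with CTC decoding. The following is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not taken from the original training code: the function name `ctc_greedy_decode`, the blank id of `0`, and the toy three-token vocabulary are assumptions made for illustration, and the real blank index depends on the `tuanio/wav2vec2-phoneme-ipa-ctc` tokenizer.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    log_probs: Sequence[Sequence[float]], blank_id: int = 0
) -> List[int]:
    """Greedy (best-path) CTC decoding for a single utterance.

    log_probs: one row per frame, one score per vocabulary entry.
    blank_id:  index of the CTC blank token (ASSUMED to be 0 here).
    """
    # 1) Take the highest-scoring token at every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in log_probs
    ]
    # 2) Merge consecutive repeats, then 3) drop blank tokens.
    return [token for token, _ in groupby(best_path) if token != blank_id]


# Toy input: 6 frames over a hypothetical vocabulary {0: blank, 1, 2}.
toy_log_probs = [
    [-0.1, -3.0, -3.0],  # blank
    [-3.0, -0.1, -3.0],  # token 1
    [-3.0, -0.1, -3.0],  # token 1 again (merged with the previous frame)
    [-0.1, -3.0, -3.0],  # blank separates the two occurrences of token 1
    [-3.0, -0.1, -3.0],  # token 1
    [-3.0, -3.0, -0.1],  # token 2
]
decoded = ctc_greedy_decode(toy_log_probs)
print(decoded)  # [1, 1, 2]
```

With the model above, one would presumably call this on `log_probs[i].tolist()` for each utterance `i` in the batch and map the resulting ids back to symbols with the tokenizer. Greedy decoding is the simplest option; beam search is a common alternative when accuracy matters more than speed.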