---
license: cc-by-4.0
language:
- bn
base_model:
- parthiv11/stt_bn_conformer_ctc_large_v2
tags:
- speech_recognition
- entity_tagging
- dialect_prediction
- gender
- age
- intent
library_name: nemo
---

# Bengali Speech Tagger - Conformer CTC Model

This speech tagger transcribes Bengali speech, annotates key entities, and predicts speaker age, dialect, and intent.

## Model Details

- **Model Type**: NeMo ASR
- **Architecture**: Conformer CTC
- **Language**: Bengali
- **Training Data**: AI4Bharat IndicVoices Bengali V1 and V2 datasets
- **Task**: Speech Recognition with Entity Tagging

## Usage

```python
import nemo.collections.asr as nemo_asr

# Load the pretrained model (downloaded on first use)
asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained('WhissleAI/speech-tagger_bn_ctc_meta')

# Transcribe a list of audio files; one result is returned per input file
transcription = asr_model.transcribe(['path/to/audio.wav'])
print(transcription[0])
```

## Model Training

- Base model: Conformer CTC (`parthiv11/stt_bn_conformer_ctc_large_v2`)
- Fine-tuned on the AI4Bharat IndicVoices Bengali dataset
- Optimized for real-time transcription

## License & Attribution

Please cite AI4Bharat when using this model: https://indicvoices.ai4bharat.org/
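
## Post-processing Tagged Output (Sketch)

Because the model emits entity and speaker tags alongside the transcription, it can help to separate the plain text from the tags after calling `transcribe` as in the Usage section. This card does not document the tag format, so the sketch below rests on assumptions: entities are taken to be wrapped as `ENTITY_<TYPE> ... END`, and utterance-level tags to look like `AGE_...`, `GENDER_...`, `DIALECT_...`, or `INTENT_...`. The function name `parse_tagged_transcript` and both patterns are hypothetical; check them against real model output before relying on them.

```python
import re
from typing import Dict, List, Tuple

# ASSUMPTION: tag conventions below are illustrative, not documented by this card.
ENTITY_PATTERN = re.compile(r"ENTITY_([A-Z0-9_]+)\s+(.*?)\s+END\b")
META_PATTERN = re.compile(r"\b(AGE|GENDER|DIALECT|INTENT)_([A-Z0-9_]+)\b")


def parse_tagged_transcript(text: str) -> Tuple[str, List[Tuple[str, str]], Dict[str, str]]:
    """Split a tagged transcript into plain text, entity spans, and utterance tags."""
    # Collect (entity_type, entity_text) pairs from the assumed ENTITY_... END spans.
    entities = [(m.group(1), m.group(2)) for m in ENTITY_PATTERN.finditer(text)]
    # Collect utterance-level tags, keyed by lowercase tag family (age, gender, ...).
    meta = {m.group(1).lower(): m.group(2) for m in META_PATTERN.finditer(text)}
    # Keep the entity words, drop the wrappers and the utterance-level tags.
    plain = ENTITY_PATTERN.sub(lambda m: m.group(2), text)
    plain = META_PATTERN.sub("", plain)
    # Collapse the whitespace left behind by removed tags.
    plain = " ".join(plain.split())
    return plain, entities, meta


if __name__ == "__main__":
    # Hand-written sample string, not actual model output.
    sample = "আমার নাম ENTITY_PERSON_NAME রহিম END AGE_30_45 GENDER_MALE INTENT_INFORM"
    plain, entities, meta = parse_tagged_transcript(sample)
    print(plain)
    print(entities)
    print(meta)
```

If the real output uses different markers, only the two compiled patterns should need adjusting; the rest of the function is format-agnostic.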