---
title: Language Identification
emoji: 🔥
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 3.44.4
app_file: app.py
pinned: false
---
This repository contains code for audio transcription and language identification. The two tasks are chained in a single pipeline that stacks two models:
* XLM-RoBERTa (https://huggingface.co/dominguesm/xlm-roberta-base-lora-language-detection): language detection
* Whisper (https://huggingface.co/openai/whisper-large): transcription

The Common Language dataset (https://huggingface.co/datasets/common_language) was used for both tasks.
References to the specific code used are included in the main `app.py` file.
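
The two-model pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: it assumes that Whisper runs first and that the language detector then classifies the resulting transcript, and the names `transcribe_and_identify` and `load_whisper_transcriber` are hypothetical. Loading the XLM-RoBERTa LoRA adapter is left to the caller, since it likely needs extra setup that is not shown here.

```python
from typing import Callable, Dict


def transcribe_and_identify(
    audio_path: str,
    transcribe: Callable[[str], str],
    detect_language: Callable[[str], str],
) -> Dict[str, str]:
    """Run speech-to-text first, then classify the language of the transcript.

    Assumption: the pipeline order is transcription -> language detection.
    """
    text = transcribe(audio_path).strip()
    if not text:
        # Nothing was transcribed, so there is no text to classify.
        return {"text": "", "language": "unknown"}
    return {"text": text, "language": detect_language(text)}


def load_whisper_transcriber(model_id: str = "openai/whisper-large") -> Callable[[str], str]:
    """Build a transcriber from the Transformers ASR pipeline (downloads the model)."""
    # Imported lazily so the module can be used without the heavy dependency.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The ASR pipeline returns a dict whose "text" key holds the transcript.
    return lambda path: asr(path)["text"]
```

Passing the two stages in as plain callables keeps the sketch independent of how each model is loaded, so either stage can be swapped or stubbed out when testing.
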
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference