alanzhuly committed on
Commit e3104df
1 parent: 598a0e1

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -8,6 +8,8 @@ tags:
  - audio
  - GGUF
  ---
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/d7Rzpm0cgCToXjtE7_U2u.png" alt="Example" style="width:400px;"/>
+
  # OmniAudio-2.6B
  OmniAudio is the world's fastest and most efficient audio-language model for on-device deployment - a 2.6B-parameter multimodal model that processes both text and audio inputs. It integrates three components: **Gemma-2-2b**, **Whisper turbo**, and a custom projector module, enabling secure, responsive audio-text processing directly on edge devices.
  Unlike traditional approaches that chain ASR and LLM models together, OmniAudio-2.6B unifies both capabilities in a single efficient architecture for minimal latency and resource overhead.