alanzhuly committed
Commit fcdc671
• 1 Parent(s): 5f3ac89

Update README.md

Files changed (1): README.md (+11 -7)

README.md CHANGED
@@ -15,6 +15,13 @@ OmniAudio is the world's fastest and most efficient audio-language model for on-
 
 Unlike traditional approaches that chain ASR and LLM models together, OmniAudio-2.6B unifies both capabilities in a single efficient architecture for minimal latency and resource overhead.
 
+## Demo
+
+<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/538_aQ2hRexTlXFL-cYhW.mp4"></video>
+
+## Performance Benchmarks on Consumer Hardware
+On a 2024 Mac Mini M4 Pro, **Qwen2-Audio-7B-Instruct** running on 🤗 Transformers achieves an average decoding speed of 6.38 tokens/second, while **OmniAudio-2.6B** through Nexa SDK reaches 35.23 tokens/second in the FP16 GGUF version and 66 tokens/second in the Q4_K_M quantized GGUF version - delivering **5.5x to 10.3x faster performance** on consumer hardware.
+
 ## Use Cases
 * **Voice QA without Internet**: Process offline voice queries like "I am at camping, how do I start a fire without fire starter?" OmniAudio provides practical guidance even without network connectivity.
 * **Voice-in Conversation**: Have conversations about personal experiences. When you say "I am having a rough day at work," OmniAudio engages in supportive talk and active listening.
@@ -22,15 +29,12 @@ Unlike traditional approaches that chain ASR and LLM models together, OmniAudio-
 * **Recording Summary**: Simply ask "Can you summarize this meeting note?" to convert lengthy recordings into concise, actionable summaries.
 * **Voice Tone Modification**: Transform casual voice memos into professional communications. When you request "Can you make this voice memo more professional?" OmniAudio adjusts the tone while preserving the core message.
 
-## Performance Benchmarks on Consumer Hardware
-On a 2024 Mac Mini M4 Pro, **Qwen2-Audio-7B-Instruct** running on 🤗 Transformers achieves an average decoding speed of 6.38 tokens/second, while **OmniAudio-2.6B** through Nexa SDK reaches 35.23 tokens/second in the FP16 GGUF version and 66 tokens/second in the Q4_K_M quantized GGUF version - delivering **5.5x to 10.3x faster performance** on consumer hardware.
-
 ## Quick Links
-1. Interactive Demo in our [HuggingFace Space]().
-2. [Quickstart for local setup]()
-3. Learn more in our [Blogs]()
+1. Interactive Demo in our [HuggingFace Space](https://huggingface.co/spaces/NexaAIDev/omni-audio-demo)
+2. [Quickstart for local setup](#How-to-Use-On-Device)
+3. Learn more in our [Blogs](https://nexa.ai/blogs/OmniAudio-2.6B)
 
-## Run OmniAudio-2.6B on Your Device
+## How to Use On Device
 Step 1: Install Nexa-SDK (local on-device inference framework)
 
 [🚀 Install Nexa-SDK](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#install-option-1-executable-installer)
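The "5.5x to 10.3x faster" range in the relocated benchmark paragraph can be sanity-checked directly from the reported tokens/second figures; a minimal check using only the numbers quoted in the diff:

```python
# Sanity-check the speedup range claimed in the benchmark paragraph.
baseline = 6.38   # Qwen2-Audio-7B-Instruct on Transformers, tokens/s
fp16 = 35.23      # OmniAudio-2.6B, FP16 GGUF via Nexa SDK, tokens/s
q4_k_m = 66.0     # OmniAudio-2.6B, Q4_K_M GGUF via Nexa SDK, tokens/s

fp16_speedup = fp16 / baseline      # ~5.5x
q4_speedup = q4_k_m / baseline      # ~10.3x
print(f"FP16 speedup:   {fp16_speedup:.1f}x")
print(f"Q4_K_M speedup: {q4_speedup:.1f}x")
```

Both ratios round to the endpoints the paragraph states, so the claim is internally consistent.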