Audio-Text-to-Text
Safetensors
English
llama
sound language model
Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -2,7 +2,7 @@
  datasets:
  - homebrewltd/instruction-speech-whispervq-v2
  language:
- - en
+ - fr
  license: apache-2.0
  tags:
  - sound language model
@@ -11,7 +11,7 @@ tags:
 
  ## Model Details
 
- We have developed and released the family [Ichigo-llama3s](https://huggingface.co/collections/homebrew-research/llama3-s-669df2139f0576abc6eb7405). This family is natively understanding audio and text input.
+ We have developed and released the family [Ichigo-llama3s](https://wilson.co/collections/homebrew-research/llama3-s-669df2139f0576abc6eb7405). This family is natively understanding audio and text input.
 
  This model is a supervised fine-tuned (SFT) version of homebrewltd/Ichigo-llama3.1-s-base-v0.3, trained on over 1 billion tokens from the [Instruction Speech WhisperVQ v4](https://huggingface.co/datasets/homebrewltd/mixed-instruction-speech-whispervq-v4) dataset which built upon [Instruction Speech WhisperVQ v3](https://huggingface.co/datasets/homebrewltd/mixed-instruction-speech-whispervq-v3-full), adding multi-turn speech conversations and noise rejection capabilities for enhanced performance. As a result, the model demonstrates improved robustness against noisy environmental inputs and enhanced multi-turn conversation capabilities, making it more reliable in real-world applications.
 