---
language:
- en
license: mit
tags:
- convAI
- conversational
- ASR
license_link: https://huggingface.co/microsoft/phi-2/resolve/main/LICENSE
widget:
- text: Hello who are you?
  example_title: Identity
- text: What can you do?
  example_title: Capabilities
- text: Create a fastapi endpoint to retrieve the weather given a zip code.
  example_title: Coding
pipeline_tag: text-generation
---

# Disclaimer

THIS PROJECT IS STILL A WORK IN PROGRESS (WIP).

# Phi-2-audio-super

Base Model: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)

Fine-tuned version of [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super) for ASR on [librispeech_asr](https://huggingface.co/datasets/librispeech_asr).

## How to run inference for text only:

```python
import transformers
import torch

if __name__ == "__main__":
    model_name = "Thytu/phi-2-audio-super"
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

    # Load the model on the first GPU and switch it to inference mode
    model = (
        transformers.AutoModelForCausalLM.from_pretrained(
            model_name,
        )
        .to("cuda:0")
        .eval()
    )

    # Exactly like for phi-2-super :D
    messages = [
        {"role": "user", "content": "Hello, who are you?"}
    ]
    inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
    # Remember the prompt length so the echoed prompt can be dropped after generation
    input_ids_cutoff = inputs.size(dim=1)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs,
            use_cache=True,
            max_new_tokens=512,
            temperature=0.2,
            top_p=0.95,
            do_sample=True,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Decode only the newly generated tokens
    completion = tokenizer.decode(
        generated_ids[0][input_ids_cutoff:],
        skip_special_tokens=True,
    )

    print(completion)
```
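
The slice at `input_ids_cutoff` in the snippet above is needed because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. The toy sketch below illustrates that slicing with plain Python lists standing in for token-id tensors. The helper name `strip_prompt` and the id values are hypothetical and exist only for illustration; they are not part of this model or of `transformers`.

```python
# Toy illustration only: plain lists stand in for token-id tensors,
# and the id values below are made up.

def strip_prompt(generated_ids, prompt_length):
    """Return only the tokens that come after the prompt (hypothetical helper)."""
    return generated_ids[prompt_length:]


prompt_ids = [11, 22, 33]         # pretend-tokenized prompt
new_ids = [44, 55, 66]            # pretend-generated continuation
generated = prompt_ids + new_ids  # the output sequence starts with the prompt

completion_ids = strip_prompt(generated, len(prompt_ids))
print(completion_ids)
```

With real tensors the same idea is written as `generated_ids[0][input_ids_cutoff:]`, where `input_ids_cutoff` is the prompt length along the sequence dimension.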

## How to run inference for ASR:

TODO