tomiwa1a committed
Commit efe5d70
1 Parent(s): 766b395

update description in README.md

Files changed (1): README.md +8 -63
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-license: mit
+license: gpl-3.0
 tags:
 - audio
 - automatic-speech-recognition
@@ -9,69 +9,14 @@ inference: false
 duplicated_from: philschmid/openai-whisper-endpoint
 ---
 
-# OpenAI [Whisper](https://github.com/openai/whisper) Inference Endpoint example
-
-> Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
-
-For more information about the model, license, and limitations, check the original repository at [openai/whisper](https://github.com/openai/whisper).
-
----
-
-This repository implements a custom `handler` task for `automatic-speech-recognition` for 🤗 Inference Endpoints using OpenAI's new Whisper model. The code for the customized pipeline is in [handler.py](https://huggingface.co/philschmid/openai-whisper-endpoint/blob/main/handler.py).
-
-There is also a [notebook](https://huggingface.co/philschmid/openai-whisper-endpoint/blob/main/create_handler.ipynb) included on how to create the `handler.py`.
-
-### Request
-
-The endpoint expects a binary audio file. Below is a cURL example and a Python example using the `requests` library.
-
-**curl**
-
-```bash
-# load audio file
-wget https://cdn-media.huggingface.co/speech_samples/sample1.flac
-
-# run request
-curl --request POST \
-  --url https://{ENDPOINT}/ \
-  --header 'Content-Type: audio/x-flac' \
-  --header 'Authorization: Bearer {HF_TOKEN}' \
-  --data-binary '@sample1.flac'
-```
-
-**Python**
-
-```python
-import mimetypes
-import requests as r
-
-ENDPOINT_URL = ""
-HF_TOKEN = ""
-
-def predict(path_to_audio: str = None):
-    # read audio file
-    with open(path_to_audio, "rb") as i:
-        b = i.read()
-    # get mimetype
-    content_type = mimetypes.guess_type(path_to_audio)[0]
-
-    headers = {
-        "Authorization": f"Bearer {HF_TOKEN}",
-        "Content-Type": content_type
-    }
-    response = r.post(ENDPOINT_URL, headers=headers, data=b)
-    return response.json()
-
-prediction = predict(path_to_audio="sample1.flac")
-prediction
-```
-
-Expected output:
-
-```json
-{"text": " going along slushy country roads and speaking to damp audiences in draughty school rooms day after day for a fortnight. He'll have to put in an appearance at some place of worship on Sunday morning, and he can come to us immediately afterwards."}
-```
+# Video Search
+
+This project contains 3 different models that can be used for searching videos.
+
+1. Whisper to transcribe mp3 audio files to text
+2. BART Sentence Transformer to generate vector embeddings from text
+3. BART LFQA to generate long form answers given a context
+
+For more context, see: [Atlas: Find Anything on Youtube](https://atila.ca/blog/tomiwa/atlas)
+
+Inspired by [philschmid/openai-whisper-endpoint](https://huggingface.co/philschmid/openai-whisper-endpoint)
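
The three models listed in the new README form a transcribe → embed → answer pipeline. The sketch below shows one way they could fit together; it is a minimal illustration, not code from this repository. The model ids (`all-MiniLM-L6-v2` standing in for the BART-based sentence embedder, `vblagoje/bart_lfqa` for long-form answers), the `question: ... context: ...` prompt format, and the file name `episode.mp3` are all assumptions.

```python
# Minimal sketch of the transcribe -> embed -> answer pipeline described above.
# Requires: pip install openai-whisper sentence-transformers transformers
import whisper
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# 1. Whisper: transcribe an mp3 audio file to text ("episode.mp3" is a placeholder)
asr = whisper.load_model("base")
transcript = asr.transcribe("episode.mp3")["text"]

# Split the transcript into rough fixed-size passages to search over
passages = [transcript[i:i + 500] for i in range(0, len(transcript), 500)]

# 2. Sentence transformer: embed the query and passages, rank by cosine similarity
#    ("all-MiniLM-L6-v2" is a stand-in for the repo's BART-based embedder)
embedder = SentenceTransformer("all-MiniLM-L6-v2")
query = "What is this video about?"
scores = util.cos_sim(embedder.encode(query), embedder.encode(passages))[0]
best_passage = passages[int(scores.argmax())]

# 3. BART LFQA: generate a long-form answer from the best-matching passage
#    (the "question: ... context: ..." input format is assumed)
lfqa = pipeline("text2text-generation", model="vblagoje/bart_lfqa")
answer = lfqa(f"question: {query} context: {best_passage}", max_length=256)
print(answer[0]["generated_text"])
```

In a real deployment the passage embeddings would be precomputed and stored in a vector index, so that only the query needs to be embedded at search time.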