update description in README.md

README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-license:
+license: gpl-3.0
 tags:
 - audio
 - automatic-speech-recognition
@@ -9,69 +9,14 @@ inference: false
 duplicated_from: philschmid/openai-whisper-endpoint
 ---
 
-#
-
-This repository implements a custom `handler` task for `automatic-speech-recognition` for 🤗 Inference Endpoints using OpenAI's new Whisper model. The code for the customized pipeline is in [handler.py](https://huggingface.co/philschmid/openai-whisper-endpoint/blob/main/handler.py).
-
-There is also a [notebook](https://huggingface.co/philschmid/openai-whisper-endpoint/blob/main/create_handler.ipynb) included that shows how to create the `handler.py`.
-
-### Request
-
-The endpoint expects a binary audio file. Below is a cURL example and a Python example using the `requests` library.
-
-**curl**
-
-```bash
-# download a sample audio file
-wget https://cdn-media.huggingface.co/speech_samples/sample1.flac
-
-# run request
-curl --request POST \
-  --url https://{ENDPOINT}/ \
-  --header 'Content-Type: audio/x-flac' \
-  --header 'Authorization: Bearer {HF_TOKEN}' \
-  --data-binary '@sample1.flac'
-```
-
-**Python**
-
-```python
-import mimetypes
-
-import requests as r
-
-ENDPOINT_URL = ""
-HF_TOKEN = ""
-
-def predict(path_to_audio: str):
-    # read the audio file as raw bytes
-    with open(path_to_audio, "rb") as f:
-        audio_bytes = f.read()
-    # guess the mimetype from the file extension, e.g. audio/x-flac
-    content_type = mimetypes.guess_type(path_to_audio)[0]
-
-    headers = {
-        "Authorization": f"Bearer {HF_TOKEN}",
-        "Content-Type": content_type,
-    }
-    response = r.post(ENDPOINT_URL, headers=headers, data=audio_bytes)
-    return response.json()
-
-prediction = predict(path_to_audio="sample1.flac")
-print(prediction)
-```
-
-Expected output:
-
-```json
-{"text": " going along slushy country roads and speaking to damp audiences in draughty school rooms day after day for a fortnight. He'll have to put in an appearance at some place of worship on Sunday morning, and he can come to us immediately afterwards."}
-```
+# Video Search
+
+This project contains 3 different models that can be used for searching videos (a sketch of how they chain together follows the diff):
+
+1. Whisper to convert mp3 audio files to text
+2. BART Sentence Transformer to generate vector embeddings from text
+3. BART LFQA to generate long form answers given a context
+
+For more context, see [Atlas: Find Anything on Youtube](https://atila.ca/blog/tomiwa/atlas)
+
+Inspired by [philschmid/openai-whisper-endpoint](https://huggingface.co/philschmid/openai-whisper-endpoint)
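
The three models form one pipeline: Whisper turns a video's audio track into a transcript, the sentence transformer embeds transcript chunks so a query can be matched against them, and BART LFQA writes a long form answer from the best-matching chunks. Below is a minimal sketch of that flow, assuming `transformers` and `sentence-transformers` are installed; the checkpoint names (`openai/whisper-base`, `all-MiniLM-L6-v2`, `vblagoje/bart_lfqa`) and the `search` helper are illustrative stand-ins, not this repo's actual configuration.

```python
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# 1. Whisper: convert the video's audio track to text
asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")

# 2. sentence transformer: embed transcript chunks for semantic search
embedder = SentenceTransformer("all-MiniLM-L6-v2")

# 3. LFQA: generate a long form answer from the retrieved chunks
lfqa = pipeline("text2text-generation", model="vblagoje/bart_lfqa")

def search(audio_path: str, query: str, chunk_size: int = 50, top_k: int = 3) -> str:
    # transcribe, then split the transcript into fixed-size word chunks
    transcript = asr(audio_path)["text"]
    words = transcript.split()
    chunks = [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

    # rank chunks against the query by cosine similarity of their embeddings
    chunk_emb = embedder.encode(chunks, convert_to_tensor=True)
    query_emb = embedder.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, chunk_emb)[0]
    best = [chunks[i] for i in scores.argsort(descending=True)[:top_k]]

    # hand the best-matching chunks to BART LFQA as context for the answer
    prompt = f"question: {query} context: {' '.join(best)}"
    return lfqa(prompt, max_length=256)[0]["generated_text"]

print(search("talk.mp3", "What does the speaker do on Sunday morning?"))
```

Chunking by a fixed word count is the simplest possible segmentation; Whisper's timestamped segments would let the search return a position in the video as well.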
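For reference, the section removed above documented a custom Inference Endpoints `handler.py` for the Whisper endpoint. A minimal sketch of the general shape such a handler takes for binary audio follows; the class name and `__call__` signature follow the Inference Endpoints custom handler convention, but the body is an assumption, not philschmid's actual implementation.

```python
from typing import Any, Dict

from transformers import pipeline

class EndpointHandler:
    def __init__(self, path: str = ""):
        # load the model once when the endpoint starts; the whisper
        # checkpoint used as a fallback here is a placeholder
        self.asr = pipeline(
            "automatic-speech-recognition",
            model=path or "openai/whisper-base",
        )

    def __call__(self, data: Dict[str, Any]) -> Dict[str, str]:
        # for binary requests the toolkit passes the raw body under "inputs"
        audio_bytes = data["inputs"]
        result = self.asr(audio_bytes)
        return {"text": result["text"]}
```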