[FEEDBACK and SHOWCASE] PRO subscription

#13
by osanseviero - opened

Feel free to add your feedback about the Inference API for PRO users as well as other features.

osanseviero pinned discussion

I am getting only a partial response. I am using HuggingFaceHub like this:

```python
repo_id = "meta-llama/Llama-2-70b-chat-hf"
args = {
    "temperature": 1,
    "max_length": 1024,
}
HuggingFaceService.llm = HuggingFaceHub(repo_id=repo_id, model_kwargs=args)
```

The prompt is:

```
You are an assistant who can generate the response based on the prompt.
Use the following pieces of context to answer the question at the end.
If you don't find the answer, just say: Sorry I didn't understand, can you rephrase please.
[Document(page_content='Types of workflow in the DigitalChameleon platform There are two types of workflows that can be created in the platform which include: 1.\tConversation: A series of nodes with questions or text displayed to the customer in a sequence one by one, to capture the response of Customer, is referred to as Conversation workflow. The nodes of a workflow of conversation type are loaded on the webpage to the customer one at a time. The flow can be modified to return to a previous flow or allow customer to resume work at a later point in time. Workflow will go to the next node only when the customer performs the desired action in the previous node as configured in the workflow. 2.\tForm: A one time loading of the nodes/questions/messages to the end customer all at once in the UI of a form. The form will be created in the similar manner as we create for conversation in the CMS except for the workflow type in the journey properties should be selected as Form while creating/copying the workflow.
Question: explain the Types of workflow in the DigitalChameleon platform
```

The result comes back truncated:

```
"result": ". \n ')]] Sure, I'd be happy to explain the types of"
```

I am using LangChain to get answers based on a text file.
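Not from the original post, but a minimal sketch of one thing worth trying, assuming the truncation comes from the generation length cap: the Inference API's `max_new_tokens` parameter bounds only the tokens generated after the prompt, whereas `max_length` counts the prompt tokens too, so a long context can eat almost the whole budget. The `return_full_text` flag and the commented-out `HuggingFaceHub` call are assumptions based on the standard text-generation parameters; running the call needs a valid HF API token.

```python
# Hedged sketch (not the original poster's exact fix): use max_new_tokens so
# the 1024-token budget applies to the generated text only, not prompt + output.
args = {
    "temperature": 1.0,
    "max_new_tokens": 1024,     # cap on generated tokens, prompt excluded
    "return_full_text": False,  # return only the completion, not the echoed prompt
}

# from langchain.llms import HuggingFaceHub  # needs HUGGINGFACEHUB_API_TOKEN set
# llm = HuggingFaceHub(
#     repo_id="meta-llama/Llama-2-70b-chat-hf",
#     model_kwargs=args,
# )
```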

Hugging Face org

@it-chameleoncx can you format your post with codeblocks (```) thanks

Have any models been added for PRO users beyond the four listed in the announcement blog post, such as the current top Code Llama derivative from Phind? Or, if I wanted to use that model with llm-vscode, would I still need to pay for my own Inference Endpoint?

Are you planning to add more models to the PRO interfaces, for example teknium/OpenHermes-2.5-Mistral-7B?

Hi,
Please add a PRO interface for mistralai/Mixtral-8x7B-Instruct-v0.1. It would also be nice to have interfaces for other models that are available through HuggingChat but not to PRO subscribers.
Thank you 🙂

Hi, can you please provide a link to a privacy policy that applies to the PRO Inference API?

Hello, can you please add https://huggingface.co/aaditya/Llama3-OpenBioLLM-70B to the PRO subscription?

Sorry if I missed it, but are there any limitations on requests for PRO / free accounts, such as a limit on tokens?

I am trying to access meta-llama/Llama-2-70b-chat-hf, which was previously available to PRO subscribers, but the model does not seem to respond.
Can you please reactivate it?

I can't apply spaces.GPU to async functions, and I can't apply spaces.GPU to wrapped functions either. It would be nice if both were possible.

Hello Hugging Face Support Team,

I'm interested in using the models available through the PRO subscription and have reviewed the details on Inference for PRO at the following link: https://huggingface.co/blog/inference-pro.

Specifically, I would like to use the following model:
https://api-inference.huggingface.co/models/openai/whisper-large-v3-turbo

I'd like to know the monthly usage limits for this model under the PRO subscription: how many requests can I make in a month, and what other limitations might apply?

Could you please provide detailed information regarding rate limits, monthly request quotas, response times, and any other restrictions associated with the PRO plan?
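Not from the original post, but a minimal sketch of what a call to that endpoint would look like, assuming the standard Inference API pattern of POSTing raw audio bytes with a bearer token. The token and audio bytes below are placeholders, and the request is only constructed, not sent:

```python
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/openai/whisper-large-v3-turbo"

# Placeholders: substitute a real PRO token and real audio bytes before sending.
HF_TOKEN = "hf_xxx"
audio_bytes = b"\x00\x00"  # would normally be the contents of e.g. a .flac file

req = urllib.request.Request(
    API_URL,
    data=audio_bytes,
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "audio/flac",
    },
    method="POST",
)

# Not executed here; a real call would be:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())  # JSON response with the transcription
```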

Thank you for your assistance.
