Why use the InferenceClient?
🤝 Seamless transition: keep your existing code structure while leveraging LLMs hosted on the Hugging Face Hub.
🤗 Direct integration: easily launch a model to run inference using our Inference Endpoints service.
🚀 Stay updated: always be in sync with the latest Text-Generation-Inference (TGI) updates.
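As a minimal sketch of that "seamless transition", the snippet below uses the OpenAI-compatible syntax exposed by `InferenceClient` from `huggingface_hub`; the model ID is only an example and you may need to pass a `token` depending on your setup:

```python
from huggingface_hub import InferenceClient

# Uses your Hugging Face credentials; pass token="hf_..." explicitly if needed.
client = InferenceClient()

# Same call shape as the OpenAI SDK: client.chat.completions.create(...)
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # example model; any chat model on the Hub works
    messages=[{"role": "user", "content": "Why is open-source software important?"}],
    max_tokens=100,
)

print(response.choices[0].message.content)
```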
More details can be found in the inference guide: https://huggingface.co/docs/huggingface_hub/main/en/guides/inference#openai-compatibility