---
title: Inspect Evals/bfcl
emoji: 📊
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "latest"
pinned: false
---

# bfcl

This eval was run using [evaljobs](https://github.com/dvsrepo/evaljobs).

## Command

```bash
evaljobs inspect_evals/bfcl \
  --model hf-inference-providers/moonshotai/Kimi-K2-Thinking,hf-inference-providers/meta-llama/Llama-3.1-8B-Instruct,hf-inference-providers/openai/gpt-oss-20b,hf-inference-providers/zai-org/GLM-4.6,hf-inference-providers/openai/gpt-oss-120b,hf-inference-providers/deepseek-ai/DeepSeek-V3.2-Exp,hf-inference-providers/meta-llama/Llama-3.2-3B-Instruct,hf-inference-providers/Qwen/Qwen2.5-7B-Instruct,hf-inference-providers/Qwen/Qwen3-4B-Instruct-2507,hf-inference-providers/deepseek-ai/DeepSeek-R1 \
  --name bfcl-trending-models
```

## Run with other models

To run this eval with a different model, substitute your own model identifier and job name:

```bash
evaljobs inspect_evals/bfcl \
  --model <model> \
  --name <name> \
  --flavor cpu-basic
```

## Inspect eval command

The eval was executed with:

```bash
inspect eval-set inspect_evals/bfcl \
  --model hf-inference-providers/moonshotai/Kimi-K2-Thinking,hf-inference-providers/meta-llama/Llama-3.1-8B-Instruct,hf-inference-providers/openai/gpt-oss-20b,hf-inference-providers/zai-org/GLM-4.6,hf-inference-providers/openai/gpt-oss-120b,hf-inference-providers/deepseek-ai/DeepSeek-V3.2-Exp,hf-inference-providers/meta-llama/Llama-3.2-3B-Instruct,hf-inference-providers/Qwen/Qwen2.5-7B-Instruct,hf-inference-providers/Qwen/Qwen3-4B-Instruct-2507,hf-inference-providers/deepseek-ai/DeepSeek-R1 \
  --log-shared \
  --log-buffer 100
```