Gemma 2B Inference Endpoints error

#46
by gawon16 - opened

My Inference Endpoint is unable to start; I'm getting this error:

Endpoint encountered an error.
You can try restarting it using the "pause" button above. Check logs for more details.
Server message:Endpoint failed to start.

What can I do to get this running? I've tried pausing and re-initializing it repeatedly, but I get the same error every time.

Help anyone?

Google org

Hi there. Which error did you get in the logs?


2024/04/11 16:23:09 ~ {"timestamp":"2024-04-11T07:23:09.599170Z","level":"INFO","fields":{"message":"Args { model_id: "/repository", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: None, speculate: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 1512, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 2048, max_batch_total_tokens: None, max_waiting_tokens: 20, max_batch_size: None, cuda_graphs: [1, 2, 4, 8, 16, 32, 64, 96, 128], hostname: "r-hwgi-gemma-2b-efr-2db7c6ju-02fac-sw7e3", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: true, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, tokenizer_config_path: None, disable_grammar_support: false, env: false }"},"target":"text_generation_launcher"}
2024/04/11 16:23:09 ~ {"timestamp":"2024-04-11T07:23:09.599219Z","level":"INFO","fields":{"message":"Sharding model on 4 processes"},"target":"text_generation_launcher"}
2024/04/11 16:23:09 ~ {"timestamp":"2024-04-11T07:23:09.599305Z","level":"INFO","fields":{"message":"Starting download process."},"target":"text_generation_launcher","span":{"name":"download"},"spans":[{"name":"download"}]}
2024/04/11 16:23:11 ~ {"timestamp":"2024-04-11T07:23:11.917102Z","level":"INFO","fields":{"message":"Files are already present on the host. Skipping download.\n"},"target":"text_generation_launcher"}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.402047Z","level":"INFO","fields":{"message":"Successfully downloaded weights."},"target":"text_generation_launcher","span":{"name":"download"},"spans":[{"name":"download"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.402305Z","level":"INFO","fields":{"message":"Starting shard"},"target":"text_generation_launcher","span":{"rank":1,"name":"shard-manager"},"spans":[{"rank":1,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.402306Z","level":"INFO","fields":{"message":"Starting shard"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.402394Z","level":"INFO","fields":{"message":"Starting shard"},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.402847Z","level":"INFO","fields":{"message":"Starting shard"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.502431Z","level":"INFO","fields":{"message":"Shutting down shards"},"target":"text_generation_launcher"}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.507161Z","level":"INFO","fields":{"message":"Shard terminated"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.507365Z","level":"INFO","fields":{"message":"Shard terminated"},"target":"text_generation_launcher","span":{"rank":1,"name":"shard-manager"},"spans":[{"rank":1,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.507543Z","level":"INFO","fields":{"message":"Shard terminated"},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
2024/04/11 16:23:12 ~ {"timestamp":"2024-04-11T07:23:12.507608Z","level":"INFO","fields":{"message":"Shard terminated"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}

Here you are.

Hi @gawon16, thanks for reporting. We recommend updating the task in your endpoint from Question-Answering to Text-Generation. If you continue to run into issues deploying or have further questions, please email us at api-enterprise@huggingface.co and we'll take a deeper look. Thanks again!
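
For reference, here is a minimal sketch of creating the endpoint programmatically with the Text-Generation task via the `huggingface_hub` client. The endpoint name, instance type/size, vendor, and region below are illustrative placeholders, not the configuration from this thread; adjust them to your own account and quota.

```python
# Minimal sketch, assuming a recent huggingface_hub is installed and you are
# logged in (e.g. via `huggingface-cli login`). All hardware/region values
# below are placeholders -- pick ones available to your account.
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "gemma-2b-demo",                # hypothetical endpoint name
    repository="google/gemma-2b",
    framework="pytorch",
    task="text-generation",         # the key fix: Text-Generation, not Question-Answering
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    type="protected",
    instance_size="x1",
    instance_type="nvidia-a10g",
)

endpoint.wait()                     # block until the endpoint reports "running"
print(endpoint.url)
```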

Google org

@gawon16 , I hope the issue has been resolved. Please let us know if any further assistance is needed. Thanks!
