lvwerra/starcoderbase-gsm8k

starcoderbase-gsm8k

This model is based on https://huggingface.co/bigcode/starcoderbase and is fine-tuned on the GSM8K dataset using reinforcement learning via TRL's TextEnvironment (https://github.com/huggingface/trl/pull/424).
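A minimal sketch of loading the checkpoint with the `transformers` library is shown here. It assumes `torch`, `transformers`, and `accelerate` are installed and that enough accelerator memory is available. The card does not state the prompt format used during training, so the prompt is passed through unchanged. The helper names `estimate_weight_gib` and `generate` are illustrative, not part of any library.

```python
MODEL_ID = "lvwerra/starcoderbase-gsm8k"
N_PARAMS = 15.5e9  # parameter count reported on the model page


def estimate_weight_gib(n_params: float, bytes_per_param: int) -> float:
    """Rough size of the weights alone, in GiB (ignores activations and KV cache)."""
    return n_params * bytes_per_param / 1024**3


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return only the newly generated text.

    Heavy imports live inside the function so that importing this file
    does not require torch or trigger a download.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: bfloat16 is acceptable for inference; the stored weights are F32.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(f"F32 weights:  ~{estimate_weight_gib(N_PARAMS, 4):.0f} GiB")
    print(f"BF16 weights: ~{estimate_weight_gib(N_PARAMS, 2):.0f} GiB")
```

The size estimate is why the sketch casts to bfloat16: at four bytes per parameter the weights alone exceed the memory of most single accelerators.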

Out of Scope Use

Replacing human expertise

Bias, Risks, and Limitations

May generate answers that are incorrect or misleading.
May copy answers from the training data verbatim.
May generate language that is hateful or promotes discrimination.
May generate language that is offensive to direct or indirect users or to people or groups mentioned.

Recommendations

Answers should be validated through the use of external sources.
Disparities between the data contributors and the direct and indirect users of the technology should inform developers in assessing what constitutes an appropriate use case.
Further research is needed to attribute model generations to sources in the training data, especially in cases where the model copies answers from the training data.
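One way to act on the first recommendation, when a reference solution is available, is to compare the final number in a generated answer with the reference. The sketch below relies on the GSM8K convention that reference solutions end with `#### <number>`. The function names are hypothetical, and taking the last number in the generated text as the model's answer is an assumption about its output, not documented behavior.

```python
import re
from typing import Optional

# Matches integers and decimals, allowing thousands separators such as 1,234.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[float]:
    """Return the last number that appears in the text, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def reference_answer(gsm8k_answer: str) -> Optional[float]:
    """Parse the value after the final '####' marker of a GSM8K reference solution."""
    _, marker, tail = gsm8k_answer.rpartition("####")
    return extract_final_number(tail) if marker else None


def is_correct(generated: str, gsm8k_answer: str, tol: float = 1e-6) -> bool:
    """True when the generated final number equals the reference within tol."""
    predicted = extract_final_number(generated)
    expected = reference_answer(gsm8k_answer)
    if predicted is None or expected is None:
        return False
    return abs(predicted - expected) <= tol
```

A numeric match only confirms the final value; it says nothing about whether the intermediate reasoning is sound, so a manual review of a sample of answers remains useful.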

Model size: 15.5B params (Safetensors)
Tensor type: F32
