fhswf/BPE_GPT2_TinyStoriesV2_cleaned_4096


BPE Tokenizer for TinyStoriesV2

Based on the GPT-Neo BPE tokenizer, but with a smaller vocabulary. Trained on TinyStoriesV2.

Vocab size: 4096
- 256 base characters
- 1 extra token: <|endoftext|>
- 3839 merges
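
The three counts add up to the stated vocabulary size, since each BPE merge contributes exactly one new token on top of the base characters and the special token. A minimal sketch of that arithmetic follows; the constant and function names are illustrative assumptions and are not part of the tokenizer files.

```python
# Vocabulary composition as listed in the model card.
# All names here are hypothetical, chosen only for this illustration.
BASE_CHARS = 256                      # base characters
SPECIAL_TOKENS = ["<|endoftext|>"]    # the single extra token
NUM_MERGES = 3839                     # learned BPE merges


def vocab_size(base_chars: int, special_tokens: list[str], num_merges: int) -> int:
    """Each merge adds one token, so the total is a plain sum."""
    return base_chars + len(special_tokens) + num_merges


total = vocab_size(BASE_CHARS, SPECIAL_TOKENS, NUM_MERGES)
print(total)  # 4096
```

Read the other way around, a target vocabulary of 4096 with 256 base characters and one special token leaves room for 4096 - 256 - 1 = 3839 merges.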
