---
license: apache-2.0
language:
- ru
pipeline_tag: text-generation
---

# BulgakovLM 3B

A language model trained on Russian text, which may be suitable for further fine-tuning. The 100 GB dataset consisted primarily of web pages, books, poems, and prose. The model was trained for 2 epochs.

It uses the GPT-J architecture with a context window of 4K tokens.

Trained thanks to a TRC grant on a TPU-VM v3-8.
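
## Usage

A minimal generation sketch using the standard 🤗 Transformers text-generation API. The repository id below is a placeholder, not confirmed by this card; replace it with the model's actual Hugging Face path. Sampling parameters are illustrative defaults, not recommended settings.

```python
# Minimal sketch, assuming the standard Transformers causal-LM API.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "BulgakovLM-3B"  # placeholder; replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

prompt = "Рукописи не горят, потому что"  # "Manuscripts don't burn, because"
inputs = tokenizer(prompt, return_tensors="pt")

# The model was trained with a 4K-token context window,
# so the prompt plus generated tokens should stay within that limit.
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    top_p=0.95,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```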