BulgakovLM 3B

A language model trained on Russian text. It may be suitable for further fine-tuning. The 100 GB training dataset consisted primarily of web pages, books, poems, and prose. The model was trained for 2 epochs.

The model uses the GPT-J architecture with a context window of 4k tokens.
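Because the context window is shared between the prompt and the generated continuation, long prompts need to be trimmed before generation. The sketch below shows one simple way to do that on a list of token IDs. It assumes that "4k" means 4096 positions; the `fit_prompt` helper and the `CONTEXT_WINDOW` constant are hypothetical names introduced here for illustration and are not part of the model or of any library.

```python
# Assumption: "4k tokens" is taken to mean 4096 positions.
CONTEXT_WINDOW = 4096


def fit_prompt(token_ids, max_new_tokens, context_window=CONTEXT_WINDOW):
    """Keep the most recent prompt tokens so prompt + generation fits the window."""
    if not 0 < max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be between 1 and context_window - 1")
    budget = context_window - max_new_tokens  # positions left for the prompt
    return list(token_ids[-budget:])


# Illustrative token IDs only; real IDs would come from the tokenizer.
long_prompt = list(range(5000))
trimmed = fit_prompt(long_prompt, max_new_tokens=48)
print(len(trimmed))
```

Dropping tokens from the start keeps the text closest to the continuation point, which is usually what matters most for a causal language model. Other strategies, such as summarizing the earlier text, are equally valid.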

Trained on a TPU-VM v3-8, thanks to a TRC grant.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("0x7o/BulgakovLM-3B")
model = AutoModelForCausalLM.from_pretrained("0x7o/BulgakovLM-3B")

# Tokenize the prompt ("Artificial intelligence is") and move it to the model's device
input_ids = tokenizer("Искусственный интеллект - это", return_tensors="pt")["input_ids"].to(model.device)
# Sample up to 48 new tokens with temperature 0.7
output = model.generate(input_ids, max_new_tokens=48, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0]))

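Example output (sampling is random, so results will vary):

Искусственный интеллект - это всего-навсего программа, которая анализирует данные и решает, насколько тот или иной выбор может оказаться оптимальным. Как и во всех остальных сферах человеческой деятельности, в IT есть свои плюсы и минусы. И если в прошлом веке искусственный интеллект был чем

English translation: "Artificial intelligence is nothing more than a program that analyzes data and decides how optimal a given choice may turn out to be. As in all other spheres of human activity, IT has its pros and cons. And if in the last century artificial intelligence was" (the sample ends mid-sentence).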
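The usage example samples with `do_sample=True` and `temperature=0.7`. As a rough illustration of what the temperature does, the sketch below rescales a set of logits before applying softmax and then draws one token index. It is a simplified pure-Python model of the idea, not the code that `transformers` runs internally; the function names and the toy logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for a four-token vocabulary (illustrative values only)
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs_default = softmax_with_temperature(toy_logits, temperature=1.0)
probs_sharper = softmax_with_temperature(toy_logits, temperature=0.7)
print(max(probs_default), max(probs_sharper))
print(sample_token(toy_logits, temperature=0.7, rng=random.Random(0)))
```

A temperature below 1 concentrates probability on the highest-scoring tokens, so samples are less varied; a temperature above 1 flattens the distribution.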

Evaluation

The results below were obtained on MERA, a Russian-language benchmark.

Total score: 0.198

| Task | Result | Metric |
|---|---|---|
| BPS | 0.44 | Accuracy |
| LCS | 0.118 | Accuracy |
| RCB | 0.333 / 0.167 | Avg. F1 / Accuracy |
| USE | 0 | Grade Norm |
| RWSD | 0.523 | Accuracy |
| PARus | 0.498 | Accuracy |
| ruTiE | 0.5 | Accuracy |
| MultiQ | 0.059 / 0.007 | F1 / EM |
| ruMMLU | 0.25 | Accuracy |
| CheGeKa | 0.006 / 0 | F1 / EM |
| ruModAr | 0.001 | Accuracy |
| SimpleAr | 0.001 | Accuracy |
| ruMultiAr | 0.011 | Accuracy |
| MathLogicQA | 0.245 | Accuracy |
| ruHumanEval | 0 / 0 / 0 | pass@k |
| ruWorldTree | 0.265 / 0.246 | Avg. F1 / Accuracy |
| ruOpenBookQA | 0.24 / 0.221 | Avg. F1 / Accuracy |
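The card does not say how the total score is derived from the per-task results. The sketch below tests one plausible reading: each task's metrics are averaged first, and the total is the unweighted mean over tasks. This aggregation rule is an assumption, not something stated in the card. Under it, the values in the table do reproduce the reported total of 0.198.

```python
# Per-task results copied from the table; multi-metric tasks list every value.
results = {
    "BPS": [0.44],
    "LCS": [0.118],
    "RCB": [0.333, 0.167],
    "USE": [0.0],
    "RWSD": [0.523],
    "PARus": [0.498],
    "ruTiE": [0.5],
    "MultiQ": [0.059, 0.007],
    "ruMMLU": [0.25],
    "CheGeKa": [0.006, 0.0],
    "ruModAr": [0.001],
    "SimpleAr": [0.001],
    "ruMultiAr": [0.011],
    "MathLogicQA": [0.245],
    "ruHumanEval": [0.0, 0.0, 0.0],
    "ruWorldTree": [0.265, 0.246],
    "ruOpenBookQA": [0.24, 0.221],
}


def task_score(metrics):
    """Average the metrics reported for a single task."""
    return sum(metrics) / len(metrics)


def total_score(task_results):
    """Unweighted mean of per-task scores (assumed aggregation rule)."""
    scores = [task_score(m) for m in task_results.values()]
    return sum(scores) / len(scores)


total = total_score(results)
print(round(total, 3))  # prints 0.198
```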