metadata

library_name: transformers
tags:
  - chess
license: mit
language:
  - en

Model Card for OpenSI/cognitive_AI_finetune_3

The base model, Mistral-7B-v0.1, has been fine-tuned to improve its reasoning, game analysis, and chess understanding, including proficiency in algebraic notation and FEN (Forsyth-Edwards Notation). The aim is a robust AI system architecture that can integrate various tools seamlessly and so improve cognitive abilities within the controlled environment of chess.
The full work is available as arXiv:2408.04910.

Model Description

Developed by: Danny Xu, Carlos Kuhn, Muntasir Adnan
Funded by: OpenSI
Model type: Transformer based
License: MIT
Finetuned from model: Mistral-7B-v0.1

Model Sources

Repository: https://github.com/TheOpenSI/cognitive_AI_experiments
Paper: Unleashing Artificial Cognition: Integrating Multiple AI Systems

Uses

Direct Use

Chess analysis
Measure cognitive qualities in a controlled environment

Downstream Use

AGI
Cognition capability of AI Systems

How to Get Started with the Model

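This repository contains only the LoRA adapter. To use it, load the adapter on top of the base Mistral model:

from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

lora_repo = "OpenSI/cognitive_AI_finetune_3"
adapter_config = PeftConfig.from_pretrained(lora_repo)

# The adapter config records the base model it was trained on (Mistral-7B-v0.1).
base_model = adapter_config.base_model_name_or_path

# bnb_config is the BitsAndBytesConfig listed under Training Hyperparameters.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config
)

openSI_chess = PeftModel.from_pretrained(model, lora_repo)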
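Once the adapter is loaded, the model can be queried like any causal language model. The helper below is a minimal sketch rather than the authors' inference code: the function name `ask_chess_model`, the greedy decoding settings, and the prompt wording in the usage note are all assumptions, since this card does not document a prompt format.

```python
def ask_chess_model(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a completion for `prompt` and return only the newly generated text.

    `model` is expected to behave like the PEFT-wrapped model loaded above and
    `tokenizer` like the matching Hugging Face tokenizer (both are assumptions).
    """
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Greedy decoding keeps the output deterministic for analysis tasks.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

    # Drop the prompt tokens so only the model's continuation is returned.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With a tokenizer created via `AutoTokenizer.from_pretrained(base_model)`, a call might look like `ask_chess_model(openSI_chess, tokenizer, "Which side is to move in this FEN? rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1")`. The prompt text here is purely illustrative.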

Training Details

Training Data

Analysis
Probable winner
Next move prediction
FEN parsing
Capture analysis
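To make the FEN parsing task above concrete, here is a small sketch of what parsing a FEN string involves. It is not taken from the training pipeline; `parse_fen` and the returned dictionary keys are hypothetical names chosen for illustration, and the function checks only the board shape, not full chess legality.

```python
def parse_fen(fen: str) -> dict:
    """Split a FEN string into its six fields and expand the board to an 8x8 grid."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, side, castling, en_passant, halfmove, fullmove = fields

    board = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                # A digit encodes that many consecutive empty squares.
                row.extend(["."] * int(ch))
            else:
                # Uppercase letters are white pieces, lowercase are black.
                row.append(ch)
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        board.append(row)
    if len(board) != 8:
        raise ValueError("piece placement must contain 8 ranks")

    return {
        "board": board,  # board[0] is rank 8, board[7] is rank 1
        "side_to_move": side,
        "castling": castling,
        "en_passant": en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


# Position after 1. e4 from the standard starting position.
position = parse_fen("rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1")
for row in position["board"]:
    print(" ".join(row))
```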

Training Hyperparameters

Training regime:

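import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
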
model_args = TrainingArguments(
    output_dir="mistral_7b",
    num_train_epochs=3,
    # max_steps=50,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=20,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=False
)
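The batch-size settings above combine into an effective batch size per optimizer step. The sketch below works through that arithmetic; the single-GPU default and the dataset size of 10,000 examples are assumptions for illustration only, since the card does not state the training set size, and the step counts are approximate because the exact rounding depends on the Trainer version.

```python
import math


def training_steps(num_examples, per_device_batch_size=4, grad_accum_steps=2,
                   num_devices=1, num_epochs=3):
    """Return (effective batch size, optimizer steps per epoch, total optimizer steps)."""
    # Gradients are accumulated over several micro-batches before each update.
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * num_epochs


# Hypothetical dataset size; the real number is not given in this card.
effective_batch, steps_per_epoch, total_steps = training_steps(num_examples=10_000)
print(effective_batch, steps_per_epoch, total_steps)  # 8 1250 3750
```

One detail worth checking when reproducing the run: in transformers, the "constant" schedule is created without a warmup phase, so `warmup_ratio=0.03` likely has no effect with `lr_scheduler_type="constant"`; "constant_with_warmup" is the variant that applies it.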

Evaluation

Testing Data

The test dataset is available at OpenSI Cognitive_AI.

Metrics

Memory
Perception
Attention
Reasoning
Anticipation

Results

Evaluation

Hardware

NVIDIA RTX 3090

Citation

@misc{Adnan2024,
    title         = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
    author        = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
    year          = {2024},
    eprint        = {2408.04910},
    archivePrefix = {arXiv}
}