
Model Card for OpenSI/cognitive_AI_finetune_3

The base model, Mistral-7B-v0.1, has been fine-tuned to improve its reasoning, game analysis, and chess understanding, including proficiency in algebraic notation and FEN (Forsyth-Edwards Notation). The aim is a robust AI system architecture that integrates multiple tools seamlessly, using chess as a controlled environment in which to strengthen and study cognitive abilities.
The full work is described in the paper listed under Citation (arXiv:2408.04910).

Model Description

  • Developed by: Danny Xu, Carlos Kuhn, Muntasir Adnan
  • Funded by: OpenSI
  • Model type: Transformer-based
  • License: MIT
  • Finetuned from model: Mistral-7B-v0.1

Model Sources

Uses

Direct Use

  • Chess analysis
  • Measure cognitive qualities in a controlled environment

Downstream Use

  • AGI
  • Cognitive capabilities of AI systems

How to Get Started with the Model

This repository contains only the LoRA adapter. To use it, load the adapter on top of the base Mistral model:

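from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

lora_repo = "OpenSI/cognitive_AI_finetune_3"
adapter_config = PeftConfig.from_pretrained(lora_repo)

# bnb_config is the BitsAndBytesConfig listed under Training Hyperparameters.
model = AutoModelForCausalLM.from_pretrained(
    adapter_config.base_model_name_or_path,  # base checkpoint recorded in the adapter config
    quantization_config=bnb_config
)

# Attach the LoRA adapter to the base model.
openSI_chess = PeftModel.from_pretrained(model, lora_repo)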
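
Once the adapter is attached, generation goes through the usual transformers tokenizer and `generate` path. The helper below is a minimal sketch of that step. The function name `ask`, the `max_new_tokens` default, and the example prompt are assumptions made for illustration; the prompt format used during fine-tuning is not documented in this card.

```python
def ask(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a completion for `prompt` (hypothetical helper, not part of the release)."""
    # Tokenize and move the tensors to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # The returned sequence contains the prompt tokens followed by the new tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call, assuming `openSI_chess` and `adapter_config` were created as
# shown above and the tokenizer comes from the same base checkpoint:
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained(adapter_config.base_model_name_or_path)
#   print(ask(openSI_chess, tokenizer, "What is the best move after 1. e4 e5 2. Nf3?"))
```

Because the decoded text includes the prompt, callers that only want the model's answer may need to strip the prompt prefix themselves.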

Training Details

Training Data

  • Analysis
  • Probable winner
  • Next move prediction
  • FEN parsing
  • Capture analysis
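
To make the FEN parsing task concrete, here is a minimal sketch of splitting a FEN string into its six standard fields and expanding the piece placement into an 8x8 grid. It is an illustration only and not the authors' data pipeline; the names `FenPosition` and `parse_fen` are hypothetical, and the parser checks structure only, not chess legality.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FenPosition:
    board: List[List[str]]  # ranks 8 down to 1; "." marks an empty square
    side_to_move: str       # "w" or "b"
    castling: str           # e.g. "KQkq" or "-"
    en_passant: str         # target square such as "e3", or "-"
    halfmove_clock: int
    fullmove_number: int


def parse_fen(fen: str) -> FenPosition:
    """Parse a FEN string into its fields (structural checks only)."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError("FEN must have 6 space-separated fields")
    placement, side, castling, en_passant, halfmove, fullmove = fields

    ranks = placement.split("/")
    if len(ranks) != 8:
        raise ValueError("piece placement must describe 8 ranks")

    board = []
    for rank in ranks:
        row = []
        for ch in rank:
            if ch in "12345678":
                row.extend("." * int(ch))  # a digit encodes a run of empty squares
            elif ch in "prnbqkPRNBQK":
                row.append(ch)             # lowercase = black, uppercase = white
            else:
                raise ValueError(f"unexpected character {ch!r} in placement")
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        board.append(row)

    return FenPosition(board, side, castling, en_passant, int(halfmove), int(fullmove))


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
start = parse_fen(START_FEN)
```

For the starting position, `start.board[0]` holds Black's back rank and `start.side_to_move` is `"w"`.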

Training Hyperparameters

  • Training regime:
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)


model_args = TrainingArguments(
    output_dir="mistral_7b",
    num_train_epochs=3,
    # max_steps=50,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=20,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=False
)
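
The batch settings above combine into an effective batch size per optimizer step. The sketch below works through that arithmetic; the single-GPU default and the dataset size of 10,000 examples are assumptions for illustration (the card does not state the training set size), and the step count is an approximation of how the Trainer rounds.

```python
import math

# Values copied from the TrainingArguments above.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 2
NUM_TRAIN_EPOCHS = 3


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def optimizer_steps(num_examples: int, num_devices: int = 1) -> int:
    """Approximate total optimizer steps across all epochs."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size(num_devices))
    return steps_per_epoch * NUM_TRAIN_EPOCHS


# Hypothetical dataset size, used only to illustrate the calculation.
print(effective_batch_size())   # 4 * 2 * 1 = 8
print(optimizer_steps(10_000))  # ceil(10000 / 8) * 3 = 3750
```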

Evaluation

Testing Data

The test dataset is available here: OpenSI Cognitive_AI

Metrics

  • Memory
  • Perception
  • Attention
  • Reasoning
  • Anticipation

Results

[Figures: Evaluation (two images)]

Hardware

Nvidia RTX 3090

Citation

@misc{Adnan2024,
    title         = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
    author        = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
    year          = {2024},
    eprint        = {2408.04910},
    archivePrefix = {arXiv}
}