Model Card for Model ID
The base model, Mistral-7B-v0.1, has been fine-tuned to improve its reasoning, game analysis, and chess understanding, including proficiency in algebraic notation and FEN (Forsyth-Edwards Notation). This fine-tuning is part of an effort to build a robust AI system architecture that integrates various tools, improving cognitive abilities within the controlled environment of chess.
The full work can be accessed here
Model Description
- Developed by: Danny Xu, Carlos Kuhn, Muntasir Adnan
- Funded by: OpenSI
- Model type: Transformer-based
- License: MIT
- Finetuned from model: Mistral-7B-v0.1
Model Sources
- Repository: https://github.com/TheOpenSI/cognitive_AI_experiments
- Paper: Unleashing Artificial Cognition: Integrating Multiple AI Systems
Uses
Direct Use
- Chess analysis
- Measure cognitive qualities in a controlled environment
Downstream Use
- AGI
- Cognition capability of AI Systems
How to Get Started with the Model
This repository contains only the LoRA adapter. To use it, load the base Mistral model and then apply the adapter to it:
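from transformers import AutoModelForCausalLM
from peft import PeftConfig, PeftModel

lora_repo = "OpenSI/cognitive_AI_finetune_3"
adapter_config = PeftConfig.from_pretrained(lora_repo)
# Base model ID as recorded in the adapter config (Mistral-7B-v0.1)
base_model = adapter_config.base_model_name_or_path
# bnb_config is the BitsAndBytesConfig shown under Training Hyperparameters
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config
)
# Attach the LoRA adapter weights to the quantized base model
openSI_chess = PeftModel.from_pretrained(model, lora_repo)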
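Once the adapter is loaded, querying it follows the usual `transformers` generation pattern. The sketch below is an illustration under stated assumptions, not the authors' inference script: the prompt wording, the helper names `build_prompt` and `ask`, and the generation settings are hypothetical, and the tokenizer is assumed to have been loaded from the base model with `AutoTokenizer.from_pretrained`.

```python
def build_prompt(fen: str, question: str) -> str:
    """Combine a FEN position and a question into one plain-text prompt.

    The wording is hypothetical; the card does not document the prompt
    format used during fine-tuning.
    """
    return f"Position (FEN): {fen.strip()}\nQuestion: {question.strip()}\nAnswer:"


def ask(model, tokenizer, fen: str, question: str, max_new_tokens: int = 64) -> str:
    """Generate a completion from an already loaded model and tokenizer."""
    import torch  # imported lazily so build_prompt works without torch installed

    prompt = build_prompt(fen, question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


start_fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
prompt = build_prompt(start_fen, "Which side is to move?")
print(prompt)
# With the adapter loaded as openSI_chess and a tokenizer for the base model:
# answer = ask(openSI_chess, tokenizer, start_fen, "Which side is to move?")
```

Keeping prompt construction separate from generation makes it easy to swap in whatever prompt format the fine-tuning data actually used.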
Training Details
Training Data
- Analysis
- Probable winner
- Next move prediction
- FEN parsing
- Capture analysis
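For readers unfamiliar with the notation, a FEN string has six space-separated fields: piece placement, side to move, castling rights, en passant target square, halfmove clock, and fullmove number. The following is a minimal pure-Python sketch of what a FEN parsing task involves. It is an illustration only, not the authors' data-generation code, and the function name `parse_fen` is hypothetical.

```python
def parse_fen(fen: str) -> dict:
    """Split a FEN string into its six fields and expand the board to an 8x8 grid."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(fields)}")
    placement, side, castling, en_passant, halfmove, fullmove = fields

    board = []
    for rank in placement.split("/"):  # ranks are listed from 8 down to 1
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend(["."] * int(ch))  # a digit encodes that many empty squares
            else:
                row.append(ch)  # uppercase = white piece, lowercase = black piece
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        board.append(row)
    if len(board) != 8:
        raise ValueError("piece placement must contain 8 ranks")

    return {
        "board": board,
        "side_to_move": side,
        "castling": castling,
        "en_passant": en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }


# Position after 1. e4 from the starting position.
position = parse_fen("rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1")
print(position["side_to_move"])       # b
print("".join(position["board"][4]))  # ....P...
```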
Training Hyperparameters
- Training regime:
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantization with double quantization; compute in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model_args = TrainingArguments(
    output_dir="mistral_7b",
    num_train_epochs=3,
    # max_steps=50,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=20,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=False
)
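As a reading aid for the arguments above: with `per_device_train_batch_size=4` and `gradient_accumulation_steps=2`, each optimizer step sees 4 × 2 = 8 examples per device. The sketch below works through that arithmetic. The dataset size of 10,000 examples and the single-GPU setting are assumptions for illustration only, since the card does not state the training set size, and the step counts are approximate because the exact figure depends on how the trainer handles a final partial batch.

```python
import math

per_device_train_batch_size = 4   # from the TrainingArguments above
gradient_accumulation_steps = 2   # from the TrainingArguments above
num_train_epochs = 3              # from the TrainingArguments above
num_devices = 1                   # assumption: a single GPU
num_examples = 10_000             # assumption: hypothetical dataset size

# Examples consumed per optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Approximate number of optimizer updates per epoch and in total.
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(effective_batch_size)  # 8
print(steps_per_epoch)       # 1250
print(total_steps)           # 3750
```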
Evaluation
Testing Data
The test dataset can be accessed here: OpenSI Cognitive_AI
Metrics
- Memory
- Perception
- Attention
- Reasoning
- Anticipation
Results
Hardware
Nvidia RTX 3090
Citation
@misc{Adnan2024,
title = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
author = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
year = {2024},
eprint = {2408.04910},
archivePrefix = {arXiv}
}