---
library_name: transformers
tags:
- chess
license: mit
language:
- en
---

# Model Card for OpenSI/cognitive_AI_finetune_3

The base model, Mistral-7B-v0.1, has been fine-tuned to improve its reasoning, game analysis, and chess understanding, including proficiency in algebraic notation and FEN (Forsyth-Edwards Notation). This enhancement aims to create a robust AI system architecture that integrates various tools seamlessly, boosting cognitive abilities within the controlled environment of chess.

The full work can be accessed [here](__link__to__add__).

### Model Description

- **Developed by:** Danny Xu, Carlos Kuhn, Muntasir Adnan
- **Funded by:** OpenSI
- **Model type:** Transformer-based
- **License:** MIT
- **Finetuned from model:** Mistral-7B-v0.1

### Model Sources

- **Repository:** https://github.com/TheOpenSI/cognitive_AI_experiments
- **Paper:** [Unleashing Artificial Cognition: Integrating Multiple AI Systems](__link__to__add__)

## Uses

### Direct Use

- Chess analysis
- Measuring cognition qualities in a controlled environment

### Downstream Use

- AGI
- Cognition capability of AI systems

## How to Get Started with the Model

This repository contains only the LoRA adapter. To use it, load the adapter on top of the quantized base Mistral model:

```
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftConfig, PeftModel

base_model = "mistralai/Mistral-7B-v0.1"

# 4-bit quantization (the full config is listed under Training Hyperparameters below)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config
)

# Load the fine-tuned LoRA adapter on top of the base model
lora_repo = "OpenSI/cognitive_AI_finetune_3"
adapter_config = PeftConfig.from_pretrained(lora_repo)
openSI_chess = PeftModel.from_pretrained(model, lora_repo)
```
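Once the adapter is loaded, `openSI_chess` can be queried like any causal language model. A minimal usage sketch, assuming the base model's tokenizer; the prompt below is illustrative only, not necessarily the exact format used in the training data:

```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Illustrative FEN prompt (position after 1. e4)
prompt = (
    "Given the FEN 'rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1', "
    "what is the best move for Black?"
)
inputs = tokenizer(prompt, return_tensors="pt").to(openSI_chess.device)

outputs = openSI_chess.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```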
## Training Details

### Training Data

The fine-tuning data covers five chess tasks:

- Analysis
- Probable winner
- Next move prediction
- FEN parsing
- Capture analysis

#### Training Hyperparameters

- **Training regime:**

```
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model_args = TrainingArguments(
    output_dir="mistral_7b",
    num_train_epochs=3,
    # max_steps=50,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=20,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=False
)
```
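The card stops short of showing how these pieces are wired into a trainer. A minimal sketch of one plausible setup, assuming `trl`'s `SFTTrainer`; the `LoraConfig` values and the `train_dataset` name are illustrative placeholders, not the configuration actually used for this adapter:

```
from peft import LoraConfig
from trl import SFTTrainer

# Hypothetical LoRA settings; the actual rank/alpha/target modules are not listed on this card
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,                  # the 4-bit base model loaded above
    args=model_args,              # the TrainingArguments above
    train_dataset=train_dataset,  # the chess dataset (placeholder name)
    peft_config=peft_config,
)
trainer.train()
```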
## Evaluation

#### Testing Data

The test dataset can be accessed here: [OpenSI Cognitive_AI](https://github.com/TheOpenSI/cognitive_AI_experiments/tree/master/data/test_framework)

#### Metrics

- Memory
- Perception
- Attention
- Reasoning
- Anticipation

### Results

Detailed results are reported in the paper linked above.
#### Hardware

Nvidia RTX 3090

## Citation

```
@misc{Adnan2024,
  title         = {Unleashing Artificial Cognition: Integrating Multiple AI Systems},
  author        = {Muntasir Adnan and Buddhi Gamage and Zhiwei Xu and Damith Herath and Carlos C. N. Kuhn},
  year          = {2024},
  eprint        = {2408.04910},
  archivePrefix = {arXiv}
}
```