# AFRAgent: Epoch 7 checkpoint

This model is a checkpoint from **AFRAgent** ([paper](https://arxiv.org/abs/2512.00846), WACV 2026): an Adaptive Feature Renormalization-based GUI agent for smartphone automation, built on InstructBLIP.

- **Architecture:** AnyResAdaIn (any-resolution adaptive feature renormalization)
- **Base model:** Salesforce/instructblip-flan-t5-xl
- **Training:** Fine-tuned on Android-in-the-Wild (AITW). This checkpoint comes from the `all_data_any_res_adain_finetuning` run (bs128, ip512, op256, ep12) and was saved at step 56266, which is approximately epoch 7.

## How to load

Loading requires the [AFRAgent](https://github.com/neerajanand321/AFRAgent/tree/main) codebase, which provides the custom `AnyResAdaIn` class.

```python
# Clone AFRAgent and add it to your Python path, then:
from models.any_res_adain_queries_fusion import AnyResAdaIn
from transformers import InstructBlipProcessor, AutoTokenizer

# Model weights, processor, and tokenizer all load from the same Hub repository.
model = AnyResAdaIn.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
processor = InstructBlipProcessor.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
tokenizer = AutoTokenizer.from_pretrained("neeraj321/AFRAgent_pure_multimodel")
```

To evaluate with the AFRAgent script:

```bash
python instructblip_main.py \
  --evaluate_dir neeraj321/AFRAgent_pure_multimodel \
  --train_any_res_adain True \
  --use_high_res True \
  --data_root dataset/aitw/general/general \
  --input_len 512 --output_len 256 --eval_bs 64
```

## License

MIT

## Citation

```bibtex
@article{anand2025afragent,
  title={AFRAgent: An Adaptive Feature Renormalization Based High Resolution Aware GUI agent},
  author={Anand, Neeraj and others},
  journal={WACV},
  year={2026}
}
```
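
## Reading the checkpoint step

The training note gives a step count (56266) and an approximate epoch (7). As a rough sanity check, the sketch below works out what those two numbers imply. It assumes that `bs128` is the global batch size with no gradient accumulation and that `ep12` means 12 epochs in total; neither is confirmed here, so treat the results as estimates rather than official figures.

```python
# Figures quoted in the training note.
CHECKPOINT_STEP = 56266
CHECKPOINT_EPOCH = 7

# Assumptions: "bs128" is the global batch size, "ep12" is the total epoch count.
BATCH_SIZE = 128
TOTAL_EPOCHS = 12

# Optimizer steps per epoch implied by "step 56266 ≈ epoch 7".
steps_per_epoch = CHECKPOINT_STEP // CHECKPOINT_EPOCH

# Approximate number of training samples seen per epoch under the batch-size assumption.
samples_per_epoch = steps_per_epoch * BATCH_SIZE

# Approximate length of the full run under the epoch-count assumption.
total_steps = steps_per_epoch * TOTAL_EPOCHS

print(f"steps per epoch:   {steps_per_epoch}")
print(f"samples per epoch: {samples_per_epoch}")
print(f"total steps:       {total_steps}")
```

Under these assumptions the checkpoint sits a little past the halfway point of the run.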