Update to set use_cache: True, which can boost inference performance a fair bit
#1 by TheBloke - opened
No description provided.
I believe use_cache gets set that way automatically because the model was trained with gradient checkpointing, so I'm happy to revert it. That makes the model easier to use for decoding.
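For anyone wanting to apply the same change to a local copy, here is a minimal sketch. It assumes the setting lives in a JSON file such as `config.json` with a top-level `use_cache` key; the helper names `with_cache_enabled` and `enable_cache_in_file` are hypothetical and not part of any library.

```python
import json
from pathlib import Path


def with_cache_enabled(config: dict) -> dict:
    """Return a copy of a model config dict with use_cache set to True."""
    updated = dict(config)  # shallow copy so the caller's dict is untouched
    updated["use_cache"] = True
    return updated


def enable_cache_in_file(config_path: str) -> None:
    """Rewrite a config.json-style file in place with use_cache enabled.

    Assumption: the file is a flat JSON object like the config.json
    shipped with the model; all other keys are preserved as-is.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(
        json.dumps(with_cache_enabled(config), indent=2) + "\n",
        encoding="utf-8",
    )
```

If the model is loaded with the transformers library, setting `model.config.use_cache = True` after loading should have the same effect at runtime without editing the file.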
zpn changed pull request status to merged